What is Research Data? Some end of term musings…

Following on from our workshop last week I have been pondering our use of language in the Open Exeter project. I won’t say it has been keeping me up at night but…

As we are funded under the Managing Research Data 2011-13 programme, I thought I would briefly blog on Research Data. This may seem strange, but it has become increasingly clear that this phrase means different things to different people. I should state here that in a past life I trained as a historian and, although I did use an element of quantitative analysis, at the time I would have said that I didn’t use data in my PhD; data was something used by my friends in the sciences. I used text-based sources, they used numbers, and data was numerical! Clearly I did use data; I just didn’t realise it.

Is it important that we define the term, though? Is “Research Data” just one of those phrases that people use knowing what they mean, and that the person they are talking to also understands, even though the conversation may be unintelligible to outside ears? Personally, I believe that we should find common ground. How can we create generic training materials on Research Data Management if the very title holds differing meanings for different researchers and librarians?

Looking at training materials and policies that have already been created, it is clear that the term has been defined slightly differently in each case. To take three examples:

  • In the glossary to the Cambridge Data Management materials they state, “We refer to ‘research data’ a lot on the Cambridge data management pages. We put nearly all research materials under this umbrella – so, yes, spreadsheets of statistics and equipment outputs are ‘data,’ but so might be research-related e-mails, drafts, interviews, analyses, footnotes, and references.”
  • On the Oxford 101 flyer ‘Managing your research data at the University of Oxford’ they state that “Research data can be textual, numerical, qualitative, quantitative, final, preliminary, physical, digital or print”, without actually defining the term.
  • The University of Hertfordshire Policy states that data is, “distinct units of information such as facts, numbers, letters, symbols, usually formatted in a specific way, stored in a database and suitable for processing by a computer”. Does this mean that data has to be electronic? Surely not.

The Australian National Data Service has an interesting guide on this very topic entitled “What is Research Data?” In it they state that there are “recognised definitions of research data available”. Does this mean there is more than one definition? If so, what should be included? Does one researcher’s output (be it an article, conference proceeding, piece of software, law report etc.) become another researcher’s data? Is it too simplistic to think of inputs and outputs as different entities? Are they one and the same, just at different stages of a research lifecycle?

I certainly don’t have answers to the questions I have posed here, but hopefully by the end of the project I will have a better idea of what the answers may be. These are only initial thoughts and I would be interested to hear what other people think. Can we define the term “Research Data”, and is it actually important that we do?

Posted under Follow the Data

This post was written by Gareth Cole on December 21, 2011

  1. Mansur Darlington January 3, 2012 13:08


    The view we developed in the ERIM Project (the precursor to REDm-MED) is that, in a restricted sense, Research Data are data that are descriptive of the research object, or are the object itself. However, during the project it became clear that ‘managing research data’ requires the contextualization of those narrowly defined data; as a result, management involves much more than the management of just these ‘data’.

    The approach we took was to identify and define the types of things (we called them ‘data records’, of which there are five principal types) which contain the research data and contextualizing data. It is these things and their relations which require management, rather than the data themselves per se. You will find details on the ERIM Project website (http://www.ukoln.ac.uk/projects/erim/), including links to our RDM Terminology and project documentation.

    I hope this doesn’t just add to the confusion!

    Best wishes,


