The Holistic Librarian – Thing 21

Hello. I’m Caroline Huxtable, the Subject Librarian for Computer Science, Engineering, Mathematics, Medical Imaging and Physics.

As part of the Holistic Librarian project I was asked to research three tasks of the ‘23 Things (+1) for Research Data Management’.

Task 21 was to research the answer to the question: Which criteria could a researcher use to select which research data he needs to preserve in the long-term?

What I knew about the topic before the task:

I knew that it is important for researchers to preserve their data long-term for potential future use by the wider research community, in order to contribute to the advancement of knowledge. I also knew that funding bodies increasingly require that data arising from publicly funded research be preserved and made openly available as a public good, with ‘as few restrictions as possible in a timely and responsible manner that does not harm intellectual property’ (Source of quote: RCUK Common Principles on Data Policy). Additionally, I was aware that it is vital to select which data should or can be preserved, as it is impractical to preserve and provide access to all data, not least because of cost implications.

What I know now:

It is good practice, and increasingly a requirement of funders, that – at the outset of a project or even at the grant application stage – researchers create and implement a Data Management Plan, which typically includes information on ‘what data will be created and how, and outlines the plans for sharing and preservation, noting what is appropriate given the nature of the data and any restrictions that may need to be applied’. [Source: Digital Curation Centre (DCC) website; section on Data Management Plans].

The plan should specify, for example:

  • which data will be preserved.
  • whether any of the data will be deleted prior to archiving (see Things 19 and 20 for further comments on this issue).
  • which type(s) of data will be preserved (raw data, derived data, samples etc.).

Beyond what is stated in the Data Management Plan, the appraisal and selection of data for preservation will be an ongoing, iterative process as the research from which the data derives progresses.

The criteria which the researcher could use to select which data to preserve long-term include:

  • whether the confidentiality and/or sensitivity of any of the data means that it cannot be archived, or can only be preserved in a dark archive.
  • whether any funding body or other legal requirements dictate which data must be archived and/or made accessible.
  • whether any non-disclosure agreements apply to any of the data.
  • whether the data fits the archive or repository’s selection policy.
  • what the likely costs of preservation are.
  • how significant the data is for research and/or scientific or social progress.
  • whether the data is unique.
  • whether the data is usable by others.
  • whether the volume of data can be accommodated by the available storage.
  • whether copyright or other legal rights pertaining to the data affect whether it can be preserved and/or made accessible.
  • whether the data constitutes the ‘vital records’ of a project, and therefore needs to be retained indefinitely.
  • whether technical issues arise, e.g. whether the file format can be maintained or transferred for future use.
  • whether there is adequate metadata to describe the data and ensure its discoverability.
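
As a rough illustration, the criteria above could be recorded as a simple yes/no checklist for each dataset. The Python sketch below is only that: the `Appraisal` class, its field names and the decision rule (treating funder or legal requirements and ‘vital records’ as mandatory, and flagging everything else for review) are assumptions made for illustration, not part of any DCC or funder guidance.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass
class Appraisal:
    """Yes/no answers for one dataset. Field names are hypothetical."""
    no_confidentiality_barrier: bool   # not too sensitive to archive
    no_nda_or_rights_barrier: bool     # no NDA or copyright obstacle
    fits_repository_policy: bool
    cost_and_volume_affordable: bool
    significant_for_research: bool
    unique: bool
    usable_by_others: bool
    format_sustainable: bool           # format can be maintained or migrated
    metadata_adequate: bool
    required_by_funder_or_law: bool = False
    vital_record: bool = False


# Criteria that make preservation mandatory rather than raising an issue.
MANDATORY = {"required_by_funder_or_law", "vital_record"}


def open_issues(a: Appraisal) -> List[str]:
    """Names of the ordinary criteria that were answered 'no'."""
    return [f.name for f in fields(a)
            if f.name not in MANDATORY and not getattr(a, f.name)]


def suggest(a: Appraisal) -> Tuple[str, List[str]]:
    """Assumed rule: mandatory data is kept; otherwise any 'no' means review."""
    issues = open_issues(a)
    if a.required_by_funder_or_law or a.vital_record:
        return "preserve (mandatory)", issues
    if not issues:
        return "preserve", issues
    return "review", issues


if __name__ == "__main__":
    # Hypothetical dataset held in a file format that may not survive.
    raw_scans = Appraisal(True, True, True, True, True, True, True, False, True)
    decision, issues = suggest(raw_scans)
    print(decision, issues)
```

A checklist like this does not make the decision for the researcher; it simply records which criteria still need attention, in keeping with the ongoing, iterative nature of appraisal.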

How I obtained this knowledge:

What else I would like to know about the topic:

Much of the existing information that I discovered seems to be written for use by repository managers, librarians, data curators etc. to assist in the creation of data appraisal, selection and curation policies. So it would be beneficial to have available a checklist of criteria from the researcher’s perspective, to which I could point researchers when answering enquiries.

How I found the task and how I would improve it:

I found the task interesting and somewhat easier to research than Things 19 and 20, perhaps because there seems to have been more written and collected together than for those earlier tasks.

Posted under Holistic Librarian, Training

This post was written by Caroline Huxtable on January 7, 2013
