The Holistic Librarian – Thing 11

Hi, I’m Diane Workman and I’m the Subject Librarian for Drama, English, Film Studies and Theology & Religion.

As part of the Holistic Librarian project I was asked to research three tasks of the ‘23 Things (+1) for Research Data Management’.

Task 11 was to research the answer to the question “What advice could you give to a researcher about backing-up his research data?”

What I knew about the topic beforehand:

I was aware of the importance of making regular back-ups of data to protect against accidental or malicious data loss, and of ensuring that one copy was stored at a different location to the rest. I was also vaguely aware that Exeter IT provide a back-up service, but was unfamiliar with the detail.

What I know now:

Making back-ups of files is an essential element of data management and should be built into a researcher’s Data Management Plan. The back-up procedure followed, and the regularity with which it is carried out, will depend upon the perceived value of the data (e.g. is it unique?) and the levels of risk considered appropriate by the researcher.

Some things that need to be considered include:

  • Institutional back-up policy
    This will determine what the researcher will be responsible for backing-up themselves, e.g. personal laptops, home PCs. Exeter IT provide more advice on the back-up options available to staff and students of the University on their web pages.

    Researchers at Exeter can also deposit their data in the Exeter Data Archive (EDA) for long-term preservation.

  • Which storage media to use
    This will depend on the quantity and type of data. Memory sticks, CD/DVD or remote, online back-up services are considered better for small amounts of data. Hard drives (networked and removable) and magnetic tapes are considered more suitable for large volumes of data.
  • Which file format to use
    The format chosen should be suitable for long-term digital preservation.
  • Regular validation of back-ups
    It’s important to test back-ups at regular intervals for completeness and integrity. They can be checked by matching file sizes, dates and checksum (hash) values against those of the original files. MD5 is a widely used checksum algorithm which can be used to verify whether two files are identical. More information on MD5 is available on the UK Data Archive website.
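
To illustrate the checksum comparison described in the last point, here is a minimal Python sketch using the standard hashlib module. It is only an illustration, not the procedure recommended by the UK Data Archive or Exeter IT; the function names file_md5 and backup_matches are hypothetical ones chosen for this example.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=65536):
    """Return the MD5 checksum of a file as a hexadecimal string.

    The file is read in chunks so that large data files do not
    have to fit in memory all at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original, backup):
    """Return True if both files have the same size and MD5 checksum.

    (Hypothetical helper for illustration only.)
    """
    original, backup = Path(original), Path(backup)
    # A cheap size comparison first; a mismatch means the copy is incomplete.
    if original.stat().st_size != backup.stat().st_size:
        return False
    return file_md5(original) == file_md5(backup)
```

Calling backup_matches with the path of an original file and the path of its back-up copy returns True only when the two checksums agree. MD5 is adequate for spotting accidental corruption; where deliberate tampering is a concern, a stronger hash such as SHA-256 (hashlib.sha256) would be the more usual choice.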

Further information on the majority of these points can be found on the websites mentioned in the section below.

How did I obtain this knowledge?

The UK Data Archive at the University of Essex is something of an authority on this topic, and has a useful webpage devoted to backing-up data.

The University of Glasgow has a set of useful webpages on data management support for researchers, including one dealing with data back-up.

Information about arrangements local to the University of Exeter can be found on the following Exeter IT web page:

What else would I like to know about this topic?

I don’t begin to understand how the MD5 checksum works! I would like to find out more about this and perhaps see it in action.

How did I find this task? How would I improve it?

It was fine. There’s enough specific advice available on authoritative websites – much more than I could convey in this post.

Posted under Holistic Librarian, Training

This post was written by Diane Workman on January 24, 2013

1 Comment so far

  1. Hannah Lloyd-Jones January 29, 2013 09:33

    Hi Diane,

    Thanks for this! Did you see that there is an MD5 checksum exercise on the UKDA page: and I think you can download a free trial of the software here if you want to test it out:

    Best wishes,

