The Holistic Librarian – Thing 20

Hello. I’m Caroline Huxtable, the Subject Librarian for Computer Science, Engineering, Mathematics, Medical Imaging and Physics.

As part of the Holistic Librarian project I was asked to research three tasks of the ‘23 Things (+1) for Research Data Management’.

Task 20 was to research the answer to the question: A researcher wants to archive sensitive research data securely for long-term preservation. What options does she have?

What I knew about the topic before the task:

I was aware that research outputs may contain sensitive data that must be securely stored and preserved. I understood sensitive data to be:

  • personal data relating to individuals.
  • data which allows individuals to be identified.
  • confidential data.
  • data which is considered commercially sensitive such as trade secrets.

What I know now:

The UK Data Protection Act 1998 defines sensitive personal data as relating to matters including racial or ethnic origin, political and/or religious beliefs, and physical or mental health. The Act applies only to personal or sensitive personal data, and not to all research data in general, nor to anonymised data. I found it difficult to find any satisfactory definition of sensitive data more generally (rather than specifically personal data).

Sensitive data may exist in both digital and non-digital formats.

Issues to consider in relation to the secure archiving of sensitive research data include:

  • Physical security of the data, such as controlling access to rooms and buildings where data is stored and transporting sensitive (non-digital) data only when absolutely necessary.
  • Network security, e.g. not storing sensitive data on servers or computers connected to an external network, and ensuring that firewalls and other security protection are in place and kept updated.
  • Security of computer systems and files, including:
    • password protection, using complex passwords and changing them regularly.
    • implementing administrator-only permissions to access some or all of the data as appropriate (the fine detail of these permissions would need to be considered in the light of how many individuals are working with the data, and whether each person requires access to all the data or just sub-sets thereof).
    • encryption of files.
    • imposition of non-disclosure agreements for users of confidential data.
    • not exchanging sensitive data via the cloud or email unless it has been encrypted.
    • ensuring that secure measures are used when data is destroyed or when devices that hold or have accessed such data are disposed of.
  • Data containing personal information is governed by the Data Protection Act, which requires that personal data be accessible only to authorised persons. It must be treated with higher security than data that does not contain personal information. This can be achieved by, for example, anonymising or aggregating data pertaining to individuals, or by storing personal information separately from the rest of the related data.
  • Whether certain data should be permanently deleted from the archived datasets so as to avoid accidental discoverability.
  • Whether data should be archived in a dark archive, i.e. preserved long-term, but not publicly discoverable or accessible.
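
As an illustration of the "storing personal information separately" point in the list above, here is a minimal Python sketch. It is only one possible approach under stated assumptions: the function name split_personal_data, the field names, and the sample records are all hypothetical, and which fields count as personal will depend on the dataset in question.

```python
import secrets

# Hypothetical field names; which fields are personal depends on the dataset.
PERSONAL_FIELDS = {"name", "email"}


def split_personal_data(records, personal_fields=PERSONAL_FIELDS):
    """Split each record into a personal part and a research part.

    The two parts share only a random link ID, so the research table can be
    stored apart from the table that identifies individuals.
    """
    personal_table = []
    research_table = []
    for record in records:
        link_id = secrets.token_hex(8)  # random value with no personal meaning
        personal = {k: v for k, v in record.items() if k in personal_fields}
        research = {k: v for k, v in record.items() if k not in personal_fields}
        personal_table.append({"link_id": link_id, **personal})
        research_table.append({"link_id": link_id, **research})
    return personal_table, research_table


# Made-up sample records, for illustration only.
records = [
    {"name": "A. Example", "email": "a@example.org", "age_band": "30-39", "response": "yes"},
    {"name": "B. Example", "email": "b@example.org", "age_band": "40-49", "response": "no"},
]
personal_table, research_table = split_personal_data(records)
```

The personal table would then need the stricter protections listed above (restricted access, encryption), while the research table holds no direct identifiers. Removing direct identifiers is not the same as full anonymisation, since the remaining fields may still allow individuals to be identified in combination.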

How I obtained this knowledge:

I primarily consulted the website of the UK Data Archive, in particular the section on data security. I also got some tips from a presentation given as part of the Open Exeter project by Caroline Dominey, the University Records Manager, on data protection, storage and sharing. Additionally I looked at the University’s Information Security Policy.

What else I would like to know about the topic:

I would like to have a greater understanding of what constitutes sensitive data. As stated above, I struggled to find a good definition, and am therefore unsure whether my answer is as comprehensive as it could be in relation to the secure preservation of such data. I would welcome expert guidance on where to find further information; a training session would be ideal, as I learn better in such an environment than through self-directed learning.

How I found the task and how I would improve it:

I found the question ‘What options does she have?’ slightly ambiguous in meaning. I took it to mean (and answered) ‘What issues does she need to consider in relation to the security of sensitive data?’

Posted under Holistic Librarian, Training

This post was written by Caroline Huxtable on January 7, 2013

1 Comment so far

  1. Hannah Lloyd-Jones January 8, 2013 16:31

    Hi Caroline,

    Your comments on Thing 20 are really useful. The question “What options does she have?” was intended to refer to where a researcher could archive this data (e.g. could this be archived on an Open Access repository or would it have to be on a dark archive, or even deleted) and how she could do this (e.g. if she anonymised the data set it would be possible to put it on OA, but would this dataset still be useful?). Next time round, we will reword the question so it is clearer.

    Another option that the researcher may have would be to put a metadata record on OA and ask for interested parties to contact her directly should they wish to reuse the data set. She could ask the data user to sign a data access agreement so that the user would have to respect certain terms and conditions of data use. This would depend on the type of sensitive data in question and the conditions under which the data was collected e.g. if it was a personal interview and the interviewee gave permission for the transcript to be used but only for other research projects.

    With regard to the definition of sensitive research data, this is a term that we will add to our glossary, which will be available online as part of the guidance that we are developing. I don’t think that there is one set definition of the term, but as you note in your introduction, it can include commercially sensitive data as well as personal sensitive data, and data on national security issues etc.

    All the best,

    Hannah
