The Holistic Librarian – Thing 4: sharing means caring

Hi, I’m Afzal, your Subject Librarian for Arabic and Politics.

Task 4: If a researcher came to you asking how they could share their research data with somebody external to the University what would you recommend?

What I knew about the topic beforehand:

That there are Facebook-type networking sites for researchers of all disciplines, such as www.researchgate.net, where researchers can get to know one another’s work, share metadata and even collaborate to a limited extent, though not in real time. Collaboration happens through the usual means, such as email, file attachments, web page sharing, and perhaps YouTube or LinkedIn. Given the nature of research and its links to funding, there is reluctance to share data externally in the way it is shared within a department or institutional project.

What I know now:

Whilst researchers can exchange their data via USB sticks, external hard drives or even conventional post, there don’t appear to be any collaborative data forums or ‘data internets’. This seems to be because of a lack of standards, policies and agreed sharing guidelines that would satisfy researchers, together with differing copyright laws and university and institutional regulations.

Sharing is practically limited to the researcher’s institution, where data sharing contracts would be in place.

How did you obtain this knowledge?

I came to the above answer by entering ‘data sharing researchers’ in Google. I didn’t really find any ‘Data Rooms’ where the world’s researchers come together and ‘chat data’ (of course, these data rooms would have protective security locks). The results invariably focused on individual organisations, e.g. at Edinburgh. I was expecting some kind of inter-institutionally owned ‘data labs’.

What else would you like to know about the topic?

Where do we go from here, given the emphasis being placed on collaborative and group research at the funding level and on Open Access at the output level? This surely cannot be limited to an individual institution any more: for instance, how do researchers from Oxford, Bristol, and the Max Planck Institute share data, and is progress being made to enable this?

How did you find this task? How would you improve it?

Disappointing answers owing to a lack of developed national strategy, which will not help us progress. To improve: provide ‘model’ responses to all the tasks.

Posted under Holistic Librarian, Training

This post was written by Afzal Hasan on January 25, 2013

The Holistic Librarian – Thing 10

Hi, I’m Diane Workman and I’m the Subject Librarian for Drama, English, Film Studies and Theology & Religion.

As part of the Holistic Librarian project I was asked to research three tasks of the ‘23 Things (+1) for Research Data Management’.

Task 10 was to research the answer to the question: “A researcher has used a secondary data set in their research. In which circumstances would she be able to put this on Open Access?”

What I knew about the topic beforehand:

I was unsure where to begin with this one, but felt that I should explore issues of Intellectual Property Rights (in relation to the original data creator) and ethics (in relation to the consent provided by participants in the original data collection).

What I know now:

It’s important to know the copyright situation for the data set that is being re-used. The researcher should establish whether the data set is still covered by copyright, and who the copyright owner is. Once they know this, they can contact the copyright owner to seek permission to publish the data set on an Open Access (OA) basis. It’s possible that the original creator may have made their material freely available for re-use in one or both of the following ways:

  • By applying a Creative Commons licence
  • By depositing in a data centre with an OA policy

It’s also important to be clear about what informed consent was obtained from the participants in the original data collection by the data creator. Any consent form signed by them should have outlined any likely re-uses of the data, ideally specifying publication of the data set in an OA repository. This relies heavily on the original data creator making good provisions for the sharing and future use of the data that they collect, something that researchers should consider when developing their Data Management Plan. The University of Glasgow acknowledges that the process of placing data sets on OA may not always be straightforward:

“There can be a tension between abiding by data protection legislation and ethical guidelines, whilst fulfilling funder and public expectations to make research results available.” [Source: University of Glasgow website; section on data protection legislation and ethics].

How did I obtain this knowledge?

I consulted several websites during the research for this question. It required reading around research data management more generally, as it relates to several issues. The following websites were all useful:

The Digital Curation Centre, and in particular their section on digital curation
http://www.dcc.ac.uk/digital-curation/what-digital-curation

The Incremental Project, part of the JISC Managing Research Data programme
http://www.lib.cam.ac.uk/preservation/incremental/index.html

The University of Cambridge Support for Managing Research Data web pages, one of the Incremental Project partners
http://www.lib.cam.ac.uk/dataman/

The University of Glasgow Data Management Support for Researchers web pages, the other Incremental Project partner
http://www.gla.ac.uk/services/datamanagement/

The UK Data Archive at the University of Essex, and in particular their section on consent and ethics
http://www.data-archive.ac.uk/create-manage/consent-ethics

What else would I like to know about this topic?

It would be useful to know more about the implications of the Data Protection Act on the re-use of data by researchers.

How did I find this task? How would I improve it?

This was the most difficult of the tasks that I was set, as there is no clear answer. Anything relating to IPR is usually not straightforward, and was further complicated when combined with the ethical aspect of the question. No single source provided an answer to the question, so it was necessary to draw my own conclusions based on the range of information sources consulted.

Posted under Holistic Librarian

This post was written by Diane Workman on January 25, 2013

The Holistic Librarian – Thing 9: a date with meta…

Hi, I’m Afzal, your Subject Librarian for Arabic and Politics.

Task 9: What is the importance of documenting research data and metadata? Where can you find useful information on data documentation and metadata?

What I knew about the topic beforehand:

I understood Research Data Documentation to mean the same as creating metadata. This is data about the data: it gives the researcher an organisational skeleton for keeping track of their own progress, and lets others know that the research is, or will be, in the domain.

What I know now:

Creating metadata is highly emphasised and de rigueur. It’s as important as the research data itself. Rigorous standards mean greater impact because they make the data easy to cite. Good metadata also allows other researchers to verify and repeat experiments, for example. This has major ethical dimensions.

Universities now have webpages emphasising the importance of metadata:

E.g. MIT http://libraries.mit.edu/guides/subjects/data-management/metadata.html

Cambridge: http://www.lib.cam.ac.uk/dataman/pages/metadata.html

Oxford: http://www.admin.ox.ac.uk/rdm/dmp/documentation/

A good introduction to the sine qua non of Metadata is found at the UK Data Archive site:

http://data-archive.ac.uk/create-manage/document
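To make the idea of ‘data about the data’ concrete, here is a minimal sketch of a metadata record kept alongside a data file. The field names are illustrative assumptions loosely modelled on common descriptive elements (title, creator, date, description, format, rights); they are not taken from any of the guides above, and the example values are hypothetical.

```python
import json

# Illustrative field list only: an assumption, not an official standard.
REQUIRED_FIELDS = (
    "title",
    "creator",
    "date_created",
    "description",
    "file_format",
    "rights",
)


def missing_fields(record):
    """Return the required fields that are absent or blank in a record."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(record.get(field, "")).strip()
    ]


def to_json(record):
    """Serialise the record so it can be saved next to the data file."""
    return json.dumps(record, indent=2, sort_keys=True)


# Hypothetical example record; every value here is made up for illustration.
example_record = {
    "title": "Example interview transcripts",
    "creator": "A. Researcher",
    "date_created": "2013-01-24",
    "description": "Anonymised transcripts from a hypothetical study.",
    "file_format": "text/plain",
    "rights": "CC BY 4.0",
}

if __name__ == "__main__":
    print(missing_fields(example_record))
    print(to_json(example_record))
```

A check like `missing_fields` is one simple way to spot an incomplete record before a data set is deposited; a real repository would specify its own required elements.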

How did you obtain this knowledge?

Googling succeeded.

What else would you like to know about the topic?

Whether there’s a standard Exeter University toolkit for ensuring rigorous metadata is created.

How did you find this task? How would you improve it?

Straightforward and educative.

Posted under Holistic Librarian, Training

This post was written by Afzal Hasan on January 24, 2013

The Holistic Librarian – Thing 11

Hi, I’m Diane Workman and I’m the Subject Librarian for Drama, English, Film Studies and Theology & Religion.

As part of the Holistic Librarian project I was asked to research three tasks of the ‘23 Things (+1) for Research Data Management’.

Task 11 was to research the answer to the question “What advice could you give to a researcher about backing-up his research data?”

What I knew about the topic beforehand:

I was aware of the importance of making regular back-ups of data to protect against accidental or malicious data loss, and of ensuring that one copy was stored at a different location to the rest. I was also vaguely aware that Exeter IT provide a back-up service, but was unfamiliar with the detail.

What I know now:

Making back-ups of files is an essential element of data management and should be built into a researcher’s Data Management Plan. The back-up procedure followed, and the regularity with which it is carried out, will depend upon the perceived value of the data (e.g. is it unique?) and the levels of risk considered appropriate by the researcher.

Some things that need to be considered include:

  • Institutional back-up policy
    This will determine what the researcher will be responsible for backing-up themselves, e.g. personal laptops, home PCs. Exeter IT provide more advice on the back-up options available to staff and students of the University on their web pages. See http://as.exeter.ac.uk/it/files/backup/

Researchers at Exeter can also deposit their data in the Exeter Data Archive (EDA) for long-term preservation.

  • Which storage media to use
    This will depend on the quantity and type of data. Memory sticks, CD/DVD or remote, online back-up services are considered better for small amounts of data. Hard drives (networked and removable) and magnetic tapes are considered more suitable for large volumes of data.
  • Which file format to use
    The format chosen should be suitable for long-term digital preservation.
  • Regular validation of back-ups
    It’s important to test back-ups at regular intervals for completeness and integrity. They can be checked by matching file sizes, dates and checksums (hash values) against the original files. MD5 is a widely used checksum which can be used to verify whether two files are identical. More information on MD5 is available on the UK Data Archive website.
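As a rough illustration of the validation step, the sketch below computes a checksum for an original file and its back-up copy using Python’s standard `hashlib` module and compares the two. The function names `file_checksum` and `backup_matches` are hypothetical, chosen for this example only. A checksum is a short ‘fingerprint’ calculated from every byte of a file, so if even one byte differs between the two copies, the fingerprints will almost certainly differ too.

```python
import hashlib


def file_checksum(path, algorithm="md5", chunk_size=1024 * 1024):
    """Return the hex checksum of a file, reading it in chunks.

    Reading in chunks means large data files do not need to fit in memory.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original_path, backup_path, algorithm="md5"):
    """Return True if both files produce the same checksum."""
    return (
        file_checksum(original_path, algorithm)
        == file_checksum(backup_path, algorithm)
    )
```

MD5 is adequate for spotting accidental corruption of a back-up. Where deliberate tampering is a concern, passing `algorithm="sha256"` to the same functions would be a more cautious choice.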

Further information on the majority of these points can be found on the websites mentioned in the section below.

How did I obtain this knowledge?

The UK Data Archive at the University of Essex is something of an authority on this topic, and had a useful webpage devoted to backing-up data.

The University of Glasgow has a set of useful webpages on data management support for researchers, including one dealing with data back-up.

Information about arrangements local to the University of Exeter can be found on the following Exeter IT web page: http://as.exeter.ac.uk/it/files/backup/

What else would I like to know about this topic?

I don’t begin to understand how the MD5 checksum works! I would like to find out more about this and perhaps see it in action.

How did I find this task? How would I improve it?

It was fine. There’s enough specific advice available on authoritative websites – much more than I could convey in this post.

Posted under Holistic Librarian, Training

This post was written by Diane Workman on January 24, 2013

The Holistic Librarian – Thing 8

Hi, I’m Afzal, your Subject Librarian for Arabic and Politics.

Task 8: A researcher asks you about her funder requirements on research data. Where could you find out this information?

What I knew about the topic beforehand:

The first thing that occurred to me was to ask the funder.

What I know now:

An excellent overview of funder requirements is found here with a more detailed look in a document called Funder Requirements for Data Management and Sharing by Gareth Knight. He covers the following funders:

1. Action Medical Research (AMR)
2. Biotechnology and Biological Sciences Research Council (BBSRC)
3. Bill & Melinda Gates Foundation
4. Breast Cancer Campaign (BCC)
5. Cancer Research UK (CRUK)
6. Department of Health, UK (DoH)
7. Department for International Development (DfID)
8. Drugs for Neglected Diseases Initiative (DNDi)
9. Economic and Social Research Council (ESRC)
10. Engineering and Physical Sciences Research Council (EPSRC)
11. GlaxoSmithKline (GSK)
12. Medical Research Council (MRC)
13. Natural Environment Research Council (NERC)
14. Wellcome Trust
15. WHO – World Health Organization – TDR
16. World Cancer Research Fund (WCRF)
17. National Health Service Technology Assessment (NHS HTA)

How did you obtain this knowledge?

I used Google.

What else would you like to know about the topic?

Do any of the funders’ policies potentially conflict with the University of Exeter’s policies? If so, what are our strategies for resolving this?

How did you find this task? How would you improve it?

Straightforward. But why would a researcher – worthy of their name – ask me about their funder?!

Posted under Holistic Librarian, Training

This post was written by Afzal Hasan on January 24, 2013

The Holistic Librarian – Thing 5

Hi, I’m Afzal, your Subject Librarian for Arabic and Politics.

Task 5: What is our institutional policy on OA and RDM and how does it compare with other institutions’ policies? Are there any other institutional policies that affect research data management?

What I knew about the topic beforehand:

OA and RDM are major components at Russell Group Universities, and I therefore assumed that there must be a university body affiliated with a national organisation that sets or coordinates national policy. I thought there should be not so much a ‘comparison’ as ‘coordination’.

What I know now:

Policy authorship at Exeter is the responsibility of the Open Access and Research Data Management Policy Task and Finish Group, formed in March 2012. I understand there’s one approved policy for PGR students and one for researchers at an advanced draft stage. These can be read here.

Comparison of inter-institutional policies

The Digital Curation Centre http://www.dcc.ac.uk/ gives guidelines on policy formation. The DCC has collated data management policies from various UK institutions, which can be seen here:

http://www.dcc.ac.uk/resources/policy-and-legal/institutional-data-policies/uk-institutional-data-policies

There are policies from non-UK bodies as well.

How did you obtain this knowledge?

I ran a search for OA policies on the Exeter University homepage. I googled OA RDM policies to find the DCC page.

What else would you like to know about the topic?

Is there a coordinated national objective?

How did you find this task? How would you improve it?

I found it straightforward. There might be copyright issues/conflicts involved. To improve the task, I’d put it more simply. It’s rather beyond our means to compare institutional policies across the world.

Posted under Holistic Librarian, Training

This post was written by Afzal Hasan on January 24, 2013

The Holistic Librarian – Thing 12

Hi, I’m Diane Workman and I’m the Subject Librarian for Drama, English, Film Studies and Theology & Religion.

As part of the Holistic Librarian project I was asked to research three tasks of the ‘23 Things (+1) for Research Data Management’.

Task 12 was to research the answer to the question “What evidence can you cite that research made available on Open Access has more impact than research that is not available on Open Access?”

What I knew about the topic beforehand:

I attended Alma Swan’s presentation during Open Access (OA) Week in October, and remembered her citing some examples where placing research papers on OA had a positive impact on their use and/or citation. The concept of Journal Impact Factor (JIF) is one that takes me out of my comfort zone, as it has traditionally been of less importance to the Humanities disciplines that I provide library support for.

What I know now:

Since 2001 many studies have taken place, largely in the Sciences, to test the hypothesis that OA provides a citation advantage “by increasing visibility, findability and accessibility for research articles” (Swan, 2010). Swan summarizes 31 studies that were published between 2001 and 2010, concluding that 27 studies found a positive citation advantage to placing a research article on OA. She also provides data on the size of the OA citation advantage found (as a % increase in citations) by discipline.

There is a move away from reliance on the traditional citation metric, the JIF, in measuring the impact of research articles. Notice is also being taken of social media-based metrics and ‘Altmetrics’ in attempting to determine the impact of research made available on OA.

How did I obtain this knowledge?

Swan’s report The Open Access citation advantage: Studies and results to date provides a useful summary of the relevant studies carried out up to 2010.

There is a useful web-based bibliography provided by the OpCit Project, which you can use to bring your knowledge up to date. This provides abstracts for a variety of studies, including a number published as recently as 2012.

What else would I like to know about this topic?

I would like to have more information about the impact of OA on research in HASS disciplines, as so many of the published studies focus upon STEMM subjects, where OA is more established. There is an interesting article by Melissa Terras, published in the OA Journal of Digital Humanities, called The impact of social media on the dissemination of research: results of an experiment, which deals with the topic from a HASS perspective.

How did I find this task? How would I improve it?

I enjoyed finding out more about this topic. It was well-documented in sources available on OA, I’m pleased to say!

Posted under Holistic Librarian, Training

This post was written by Diane Workman on January 23, 2013

The Holistic Librarian – Thing 21

Hello. I’m Caroline Huxtable, the Subject Librarian for Computer Science, Engineering, Mathematics, Medical Imaging and Physics.

As part of the Holistic Librarian project I was asked to research three tasks of the ‘23 Things (+1) for Research Data Management’.

Task 21 was to research the answer to the question: Which criteria could a researcher use to select which research data he needs to preserve in the long-term?

What I knew about the topic before the task:

I knew that it is important for researchers to preserve their data long-term for potential future use by the wider research community, in order to contribute to the advancement of knowledge. I also knew that increasingly it is a requirement of funding bodies that data arising from publicly-funded research is preserved and made openly available as a public good, with ‘as few restrictions as possible in a timely and responsible manner that does not harm intellectual property’ (Source of quote: RCUK Common Principles on Data Policy). Additionally, I was aware that it is vital to select which data should/can be preserved, as it is impractical to preserve and provide access to all data, not least because of cost implications.

What I know now:

It is good practice, and increasingly a requirement of funders, that – at the outset of a project or even at the grant application stage – researchers create and implement a Data Management Plan, which typically includes information on ‘what data will be created and how, and outlines the plans for sharing and preservation, noting what is appropriate given the nature of the data and any restrictions that may need to be applied’. [Source: Digital Curation Centre (DCC) website; section on Data Management Plans].

The plan should specify, for example:

  • which data will be preserved.
  • whether any of the data will be deleted prior to archiving (see Things 19 and 20 for further comments on this issue).
  • which type(s) of data will be preserved (raw data, derived data, samples etc.).

As well as stating the above in the Data Management Plan, the appraisal and selection of data for preservation will be an ongoing, iterative process as the research from which it derives progresses.

The criteria which the researcher could use to select which data to preserve long-term include:

  • whether the confidentiality and/or sensitivity of any of the data means that it cannot be archived, or can only be preserved in a dark archive.
  • any funding body or other legal requirements on which data must be archived and/or made accessible.
  • whether any non-disclosure agreements apply to any of the data.
  • does the data fit into the archive or repository’s selection policy?
  • the likely costs of preservation.
  • how significant is the data for research and/or scientific or social progress?
  • is the data unique?
  • is the data usable by others?
  • the volume of data and the available storage capabilities.
  • copyright and other legal rights pertaining to the data and its ability to be preserved and/or made accessible.
  • does the data constitute the ‘vital records’ of a project, and therefore need to be retained indefinitely?
  • technical issues, e.g. can the file format be maintained or transferred for future use?
  • is there adequate metadata to describe the data and ensure its discoverability?

How I obtained this knowledge:

What else I would like to know about the topic:

Much of the existing information that I discovered seems to be written for use by repository managers, librarians, data curators etc. to assist in the creation of data appraisal, selection and curation policies. So it would be beneficial to have available a checklist of criteria from the researcher’s perspective, to which I could point researchers when answering enquiries.

How I found the task and how I would improve it:

I found the task interesting and somewhat easier to research than Things 19 and 20, perhaps because there seems to have been more written and collected together than for those earlier tasks.

Posted under Holistic Librarian, Training

This post was written by Caroline Huxtable on January 7, 2013

The Holistic Librarian – Thing 20

Hello. I’m Caroline Huxtable, the Subject Librarian for Computer Science, Engineering, Mathematics, Medical Imaging and Physics.

As part of the Holistic Librarian project I was asked to research three tasks of the ‘23 Things (+1) for Research Data Management’.

Task 20 was to research the answer to the question: A researcher wants to archive sensitive research data securely for long-term preservation. What options does she have?

What I knew about the topic before the task:

I was aware that research outputs may contain sensitive data that must be securely stored and preserved. I understood sensitive data to be:

  • personal data relating to individuals.
  • data which allows individuals to be identified.
  • confidential data.
  • data which is considered commercially sensitive such as trade secrets.

What I know now:

The UK Data Protection Act 1998 defines sensitive personal data as relating to matters including racial or ethnic origin, political and/or religious beliefs, and physical or mental health. The Act applies only to personal or sensitive personal data, and not to all research data in general, nor to anonymised data. I found it difficult to find any satisfactory definition of sensitive data more generally (rather than specifically personal data).

Sensitive data may exist in both digital and non-digital formats.

Issues to consider in relation to the secure archiving of sensitive research data include:

  • Physical security of the data, such as controlling access to rooms and buildings where data is stored and transporting sensitive (non-digital) data only when absolutely necessary.
  • Network security, e.g. not storing sensitive data on servers or computers connected to an external network, and ensuring that firewalls and other security protection are in place and kept updated.
  • Security of computer systems and files, including:
    • password protection, using complex passwords and changing them regularly.
    • implementing administrator-only permissions to access some or all of the data as appropriate (the fine detail of these permissions would need to be considered in the light of how many individuals are working with the data, and whether each person requires access to all the data or just sub-sets thereof).
    • encryption of files.
    • imposition of non-disclosure agreements for users of confidential data.
    • not exchanging sensitive data via the cloud or email unless it has been encrypted.
    • ensuring that secure measures are used when data is destroyed or devices that hold or have accessed such data are disposed of.
  • Data containing personal information is governed by the Data Protection Act, which allows personal data to be accessible only to authorised persons. It must be treated with higher security than data that does not contain personal information. This can be achieved by, for example, anonymising or aggregating data pertaining to individuals, or by storing personal information separately from the rest of the related data.
  • Whether certain data should be permanently deleted from the archived datasets so as to avoid accidental discoverability.
  • Whether data should be archived in a dark archive, i.e. preserved long-term, but not publicly discoverable or accessible.
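The idea of storing personal information separately from the rest of the related data can be sketched as follows. This is an illustration under stated assumptions, not a recommended procedure: the function names, the `participant_code` field and the use of a keyed hash (HMAC-SHA256 from Python’s standard library) to generate codes are all choices made for this example. Replacing names with codes in this way is pseudonymisation rather than full anonymisation, because anyone holding the lookup table or the secret key can re-identify individuals, so both would need to be stored securely and apart from the research data.

```python
import hashlib
import hmac


def pseudonym(identifier, secret_key):
    """Derive a stable code for an identifier using a keyed hash.

    Without the secret key, the code cannot be recomputed from the identifier.
    """
    mac = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()[:16]


def split_personal_data(rows, id_field, personal_fields, secret_key):
    """Split rows into research data and a separate personal lookup table.

    The research rows keep only non-personal fields plus a participant code.
    The lookup rows hold the identifier and personal fields under the same code.
    """
    research_rows = []
    lookup_rows = []
    for row in rows:
        code = pseudonym(str(row[id_field]), secret_key)

        research = {
            key: value
            for key, value in row.items()
            if key != id_field and key not in personal_fields
        }
        research["participant_code"] = code
        research_rows.append(research)

        lookup = {key: row[key] for key in personal_fields if key in row}
        lookup[id_field] = row[id_field]
        lookup["participant_code"] = code
        lookup_rows.append(lookup)
    return research_rows, lookup_rows
```

In use, the research rows could be archived or shared more widely, while the lookup rows and the secret key stay in a restricted location accessible only to authorised persons.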

How I obtained this knowledge:

I primarily consulted the website of the UK Data Archive, in particular the section on data security. I also got some tips from a presentation given as part of the Open Exeter project by Caroline Dominey, the University Records Manager, on data protection, storage and sharing. Additionally I looked at the University’s Information Security Policy.

What else I would like to know about the topic:

I would like to have a greater understanding of what constitutes sensitive data. As stated above, I struggled to find a good definition, and am therefore unsure whether my answer is as comprehensive as it could be in relation to the secure preservation of such data. I would welcome expert guidance on where to find further information; a training session would be ideal, as I learn better in such an environment rather than via the self-directed learning method.

How I found the task and how I would improve it:

I found the question ‘What options does she have?’ slightly ambiguous in meaning. I took it to mean (and answered) ‘what issues does she need to consider in relation to the security of sensitive data?’

Posted under Holistic Librarian, Training

This post was written by Caroline Huxtable on January 7, 2013

The Holistic Librarian – Thing 19

Hello. I’m Caroline Huxtable, the Subject Librarian for Computer Science, Engineering, Mathematics, Medical Imaging and Physics.

As part of the Holistic Librarian project I was asked to research three tasks of the ‘23 Things (+1) for Research Data Management’.

Task 19 was to research the answer to the question: A researcher is working with a commercial partner on a research project. In which circumstances could the researcher make the research data from this project available on Open Access?

What I knew about the topic before the task:

I was aware that researchers in the College of Engineering, Mathematics and Physical Sciences (whose subjects I support) collaborate on projects with a range of regional, national and international external organisations including multinational companies, and that this work is often therefore commercially sensitive. However, I have not had to deal with any queries about the use of research data arising from such work, so this was a new topic for me.

What I know now:

The researcher must ensure that s/he abides by the conditions of any agreements entered into with the commercial partner, including in respect of the use of research data arising from the joint research project, such as whether it can be made available on Open Access (for example in a subject or institutional repository). The researcher must also ensure that the content of any Open Access data does not infringe copyright, e.g. that it is not derived from a licensed or commercial product. If the data does contain copyrighted material, the researcher must ensure that permission has been sought from and granted by the rights holder to include it in the Open Access dataset. Any material for which such permissions have not been granted must be deleted from the dataset prior to it being made Open Access. If the dataset has been sponsored or funded by any organisation other than the researcher’s employer, the researcher must ensure that s/he has fulfilled all obligations to that institution or organisation regarding Open Access publication.

Additionally, it is good practice, and increasingly a requirement of funders, that – at the outset of a project or even at the grant application stage – researchers create and implement a Data Management Plan, which typically includes information on ‘what data will be created and how, and outlines the plans for sharing and preservation, noting what is appropriate given the nature of the data and any restrictions that may need to be applied’. [Source: Digital Curation Centre (DCC) website; section on Data Management Plans]. Such a plan will need to consider whether there are ethical, privacy or commercial issues which may prohibit making some or all of the data publicly available on Open Access. Any restrictions on access to any of the data should be justified in the plan, for example due to the terms of a commercial partnership agreement, which may include a non-disclosure agreement or an expectation that the data will be exploited commercially or has the potential to be patented.

These are the considerations that a researcher must take into account when deciding whether such research data could be made Open Access.

How I obtained this knowledge:

The Digital Curation Centre website contains some useful information. For example, I looked at its document ‘Policy-making for Research Data in Repositories: a Guide’, and also consulted the section on Data Management Plans, e.g. the ‘Checklist for a Data Management Plan’. I also consulted the University of Exeter’s Research and Knowledge Transfer webpages, in particular the ‘Intellectual property and commercialisation’ section of their Research Toolkit.

What else I would like to know about the topic:

I feel that I have barely scratched the surface of this topic, and would like to know more. I do not feel confident that I have got a clear understanding of the subject, nor that I would be able to help a researcher who asked me this question. I would refer a query on this subject to a member of the Open Exeter team, or to Research and Knowledge Transfer.

I would welcome expert guidance on where to find further information; a training session would be ideal, as I learn better in such an environment rather than via the self-directed learning method.

How I found the task and how I would improve it:

I found this task very difficult to research, as it is not an area of which I had any prior knowledge, nor have I had any enquiries from researchers about it.

It would have been much more helpful to have had a list of links to relevant resources to refer to as I performed the task. I really needed to be pointed in the right direction, at least to get me started; I don’t learn well when faced with a bare question with no context or background.

Posted under Holistic Librarian, Training

This post was written by Caroline Huxtable on January 4, 2013