Top Tips for Developing Research Group-level RDM Policy

The Open Exeter and Marine Renewable Energy policy case study, published today, suggests some tips for other research groups who are thinking about designing their own research data management policies. The recommendations are as follows:

  • Research group level policy development should be collaborative and include consultation with all members of the research group as far as possible. Feedback from the research community should be listened to; participation in policy development can give researchers a sense of ownership and make the policy implementation phase easier.
  • It can be helpful to separate out the principles of a policy from the nitty-gritty of procedures; thus those who don’t wish to read a longer, more detailed document can understand the main points quickly and refer to the procedural document only when necessary.
  • Local research data management policies should be updated to reflect changes in institutional and funder guidelines, as well as in ethical, legal and commercial requirements; all of these should be considered during policy development.
  • Consider institutional as well as local and discipline-specific solutions. For example, if your institution provides a data repository, would it be better to use this for the long-term storage of data rather than local storage, or should data sets be stored in a discipline-specific repository?
  • Decide on the scope of the policy; different research groups have different priorities – for example, a Psychology-based group would probably be more concerned with ethical and legal issues to do with working with human participants. It may be worth concentrating first on priority areas and rolling out a more comprehensive policy at a later date.
  • Try to balance the amount of detail in the procedural document with respecting researchers’ working habits. For example, is it necessary for all researchers to use the same system to name files?
  • Work out an estimated timetable for policy and procedure development but be flexible to reflect changing circumstances if necessary.
  • Consider the relationship between guidelines for individual projects and research group policy.
  • Tailor RDM policy and procedures to the support available to your research group. For example, a group with a dedicated Computing Development Officer may be able to put into place more bespoke solutions than a group without this support.
  • Listen to researchers’ concerns and make sure they are clearly addressed in the policy and procedural documents.
  • Provide support for the initial transition. Staff may not have time to do tasks such as consolidate and transfer old data sets to a central storage system, as they are busy with current and future work, and rarely have the time to look backwards.

Have you developed a research-group level RDM policy? Do you agree with these recommendations or have any of your own suggestions? Let us know!

Posted under Case studies, News, Policy, Research

This post was written by Hannah Lloyd-Jones on July 26, 2013


Marine Renewable Energy Policy Case Study Published

We are pleased to announce that we have published a case study on developing research data management policy at research group level. The report, which was co-written by the Open Exeter project and the Marine Renewable Energy Group, is available in ORE and looks at how RDM policy and procedures were developed and implemented by the Group.

Marine Renewable Energy research on the University of Exeter’s Penryn campus is led by Dr Lars Johanning and is part of the College of Engineering, Mathematics, and Physical Sciences (CEMPS). The group decided to develop a research data management policy to ensure that the data it uses are secure, will be reusable in the future and can be shared easily amongst collaborators. The policy work was accompanied by a review of the way in which the group store data. This work has been supported through the Bridging the Gaps initiative and led by Dr Ian Ashton in conjunction with others in the research group.

Read the case study here – comments welcome!

Posted under Case studies, News, Policy, Reports

This post was written by Hannah Lloyd-Jones on July 26, 2013


Advocacy and the DAF – new Open Exeter case study

Readers may be interested in a new case study describing how the Open Exeter team promoted our online DAF survey. The survey attracted 284 responses, a very respectable number which we feel is a result of the advocacy and promotional work carried out before and during the survey. You can read or download the survey from our repository: http://hdl.handle.net/10036/3754

Posted under Advocacy and Governance, Case studies, Online survey

This post was written by Jill Evans on September 26, 2012


Collecting data that captures human emotions

Another case study by an Exeter PGR is now available: http://hdl.handle.net/10036/3697

This report, by Mrunal Chavda, a Drama PhD student, presents some fairly unusual data management challenges.

Basically, Mrunal is attempting to capture and document emotional responses to dramatic situations using human subjects.

This immediately throws up obvious issues around ethics, confidentiality and the correct use and storage of information covered by the Data Protection Act.

Given the unusual nature of the study there are some unique challenges around data collection – how to ensure the emotions captured are unselfconscious and genuine, and what technologies can be used. Devices must be reliable and robust but also as unobtrusive as possible.

All this is carried out with the purpose of developing a new analytical model based on Rasa aesthetics.

Interesting reading!

For a basic definition of Rasa:
http://m.eb.com/topic/491635

URI: http://hdl.handle.net/10036/3697

Posted under Case studies

This post was written by Jill Evans on August 20, 2012


The pitfalls of using copyrighted materials in theses

Some of you may be interested in a short case study written by one of our PGRs, Duncan Wright of Archaeology, on trying to deal retrospectively with the issue of obtaining permission to use copyrighted visual material in his thesis: 

http://hdl.handle.net/10036/3690

Like many PhD students, Duncan only became aware of copyright restrictions on reuse of materials towards the end of his research and now has the problem of trying to negotiate permission with multiple copyright holders for content that has become an integral part of his thesis.

He has the option of removing the offending material and submitting a redacted version to our repository (thesis deposit is mandatory) but clearly this will have an impact on how the intellectual value of the study is perceived and is therefore not ideal.

It’s short enough to give to new or 2nd year PGRs as a warning, so please feel free to reuse!

Posted under Case studies, Copyright, Research

This post was written by Jill Evans on August 13, 2012


Open Exeter and Marine Renewable Energy Group Policy Case Study

The Open Exeter project will be working together with marine renewable energy researchers to develop a research group-level policy on research data management.

Marine Renewable Energy research on the Tremough campus is led by Dr Lars Johanning and involves approximately 20 staff, including collaborators in biosciences and the ESI. Current research interests include hydrodynamics and marine operations, resource assessment, marine policy, offshore reliability and environmental impacts of offshore renewable energy.

The Group has decided to develop a group-level research data management policy to ensure that the data it uses are secure, will be reusable in the future and can be shared easily amongst collaborators. This work is supported through the Bridging the Gaps initiative.

The Open Exeter team has set up a Task and Finish Group which is currently working on the draft version of an institutional-level research data management and Open Access policy. The Task and Finish Group will make recommendations on how this high-level policy can be tailored for more specific disciplines and research groups, as well as how the policy can be implemented at a procedural level.

The outputs of the collaboration between Open Exeter and the Marine Renewable Energy Group will include a written case study which will document the process of developing a research group-level policy on research data management.

Members of the Group will be writing about the process of policy-creation on the Open Exeter blog so check back regularly for more information about the case study.

Posted under Advocacy and Governance, Case studies, News, Research

This post was written by Hannah Lloyd-Jones on June 6, 2012

Zen Archiving: an Open Exeter Case Study in Astrophysics

Posting this on behalf of Tom Haworth. Tom is a 2nd year Postgraduate in Astrophysics and has been commissioned by us to write a case study documenting the process of transferring large amounts of data (TBs) from an HPC system (zen) to the Exeter Data Archive.

We are interested in the whole process – from deciding what to keep and what to delete to data bundling and metadata entry. The Astrophysics Group is using the process to develop policy and guidelines on use of zen to store and manage data.

The following are some initial thoughts on how to kick off the process:

Zen Archiving: an Open Exeter Case Study in Astrophysics

Summary:

– The archiving process will have to take place from the command line (or a GUI) on zen-viz.
– Tom Haworth will develop a script that takes user-entered metadata, potentially compresses the file, and sends both directly to the archiving server.
– The Open Exeter IT team has sufficient information to perform the archiving server-end work. They are also considering command line retrieval of data.
– The kind of data that we expect to archive is completed models. Necessary software to view the data should be included too.
– Email and WIKI entries are all that will be required for training.

Where is the data
Data will be stored on zen at one of /archive/, /scratch/ or /data/. /archive/ and /scratch/ are not under warranty.

What kind of data needs to be archived
There will be a range of data of different file formats, some not seen outside of the astrophysics community. These can be collected and compressed, if not by the user then potentially by the submission script at run-time. Compression is not always worth doing so a list of compression-worthy extensions could be stored.
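The idea of keeping a list of compression-worthy extensions can be sketched in Python as follows. This is only an illustration: the extensions in the list, the function names and the choice of gzip are all assumptions, not details taken from the zen setup.

```python
import gzip
import shutil
from pathlib import Path

# Hypothetical list: extensions assumed to shrink usefully when compressed.
COMPRESS_WORTHY = {".dat", ".txt", ".csv", ".vtk"}


def worth_compressing(path: Path) -> bool:
    """Return True if the file's extension is on the compression-worthy list."""
    return path.suffix.lower() in COMPRESS_WORTHY


def prepare_for_archive(path: Path) -> Path:
    """Gzip the file if worthwhile and return the path that should be sent."""
    if not worth_compressing(path):
        return path
    compressed = path.with_name(path.name + ".gz")
    # Stream the copy so that large model files are never held in memory.
    with open(path, "rb") as src, gzip.open(compressed, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return compressed
```

A submission script could call prepare_for_archive() on each file just before transfer, so users who have not compressed their data themselves still benefit.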

The data to archive will probably be on a model-by-model basis rather than publication, but publication details will be included in the metadata. This will probably be governed by the size of the files.

Data to be archived should be completed models.

What will happen to the data on zen
This will probably be determined on a case-by-case basis depending on how frequently (if at all) the data is required. Data that has no imminent further use should be removed.

For example, I would be archiving some finished models but may also need them for my thesis.

How might extraction from the archive work from the command line?
– searching could still take place on the web
– extraction would rely on direct communication with the archiving server

Policy for archiving
We should avoid letting any user on zen archive absolutely anything and everything. We need:
– guidelines on what should be archived
– a way to track how much people have been archiving, so that we can communicate with them if it looks like they are abusing it

Metadata verification for senior users is not required. PhD students could have their submission metadata verified by their supervisor.

Metadata
Metadata is required to ensure that the data is properly referenced and can be found easily.
Entries are Title, Author, Publisher, Date Issued, URL, Abstract, Keywords, Type etc.

In HPC astrophysics there will likely be additional entries of use such as the code used to generate the data. I suggest using an “Additional Comments” field.

This information will be requested at the command line when archiving.
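As a rough illustration of requesting metadata at the command line, the Python sketch below prompts for each of the fields listed above, plus the suggested "Additional Comments" field. The JSON file of per-user defaults and the function names are assumptions made for the example; the real script may well work differently.

```python
import json
from pathlib import Path
from typing import Callable, Dict

# Fields named in the post, plus the suggested free-text comments field.
FIELDS = ["Title", "Author", "Publisher", "Date Issued", "URL",
          "Abstract", "Keywords", "Type", "Additional Comments"]


def load_defaults(config_path: Path) -> Dict[str, str]:
    """Read per-user defaults from a JSON file (hypothetical format)."""
    if not config_path.exists():
        return {}
    return json.loads(config_path.read_text())


def collect_metadata(defaults: Dict[str, str],
                     ask: Callable[[str], str] = input) -> Dict[str, str]:
    """Prompt for each field; an empty answer keeps the stored default."""
    record = {}
    for name in FIELDS:
        default = defaults.get(name, "")
        prompt = f"{name} [{default}]: " if default else f"{name}: "
        answer = ask(prompt).strip()
        record[name] = answer or default
    return record
```

Storing values such as Author and Publisher as defaults means a user only has to type the entries that change from one model to the next.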

The archiving procedure on zen
It will be completely impractical to archive the data through the web interface. It will also be impractical to download the data onto a local machine and then archive it (local machines probably will not even have the capacity to store the data). The ideal situation will be one in which data can be archived straight from zen, communicating directly with the storage server and sending the appropriate metadata in addition.

This should happen from the zen visualization node, so as not to grind the login node to a halt.

A simple command line script would be all that is required.

Basic archive script
Read in name of thing to archive
Check the size of the thing to archive
Communicate with the archiving server to check if the quota will be exceeded
If quota not exceeded
    Get metadata from user (some could be stored in a .config file for each user)
    Check if the file extension is in the list of those that are worth compressing
    Compress if worthwhile
    Copy metadata and dataToArchive across to the archiving server
Else
    Tell the user to contact the person responsible for updating quota sizes.
End
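The outline above could be turned into Python along the following lines. This is a sketch under stated assumptions, not the script that will actually be written: the quota check, metadata prompt, compression step and transfer are passed in as functions because the server-side interface has not been described, and every name here is hypothetical.

```python
from pathlib import Path
from typing import Callable, Dict


def size_in_bytes(path: Path) -> int:
    """Total size of a file, or of every file under a directory."""
    if path.is_file():
        return path.stat().st_size
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file())


def archive(path: Path,
            quota_remaining: Callable[[], int],
            get_metadata: Callable[[], Dict[str, str]],
            compress_if_worthwhile: Callable[[Path], Path],
            send: Callable[[Path, Dict[str, str]], None]) -> bool:
    """Follow the outline: check quota, gather metadata, compress, send."""
    # quota_remaining() stands in for asking the archiving server.
    if size_in_bytes(path) > quota_remaining():
        print("Quota would be exceeded; please contact the person "
              "responsible for updating quota sizes.")
        return False
    metadata = get_metadata()
    payload = compress_if_worthwhile(path)
    # send() stands in for the copy across to the archiving server.
    send(payload, metadata)
    return True
```

Keeping the server communication behind plain function arguments means the control flow can be tested on existing data before the server end is ready.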

A GUI version could also be implemented if desired, but would definitely not be necessary for zen.

At present Tom Haworth is going to develop this script and test the procedure on existing data. Pete Leggett of Open Exeter will develop the server end stuff.

Training

For zen users, essentially no training will be required. An email to the zen mailing list telling them what they need to do is standard procedure. They can also contact the zen manager if they have trouble. We can also add a section to the zen component of the astrophysics WIKI so that there is some permanent documentation.

Posted under Big Data, Case studies

This post was written by Jill Evans on May 31, 2012


Case study – The Cricket-Tracking Project

Other JISC MRD projects or those working with ‘big data’ may be interested in a case study that has been written for Open Exeter by Dr Jacq Christmas (http://hdl.handle.net/10036/3556).

The case study documents the process of reviewing, preparing, uploading and describing multiple large video files. The project that generated the files is investigating the behaviour of crickets through analysis of thousands of hours of motion-triggered video.

The project is interesting to us for a number of reasons:

• It is a cross-disciplinary/cross-departmental project – these sorts of projects are becoming increasingly common at Exeter and do throw up interesting questions around the area of ‘ownership’
• Huge amounts of data have been and continue to be produced
• Storage is a problem due to the number and size of files – most files are stored on external hard drives held in various places
• As there is no central storage system, secure backup can be a problem
• Ditto secure sharing
• The first batch of video is in a proprietary format that requires specific software in order to be viewable

The case study sets out quite clearly the thought that should be given to selecting and preparing files for upload to a repository. We are looking at how the procedures described can be adapted as templates to guide researchers from other disciplines through the deposit process, some aspects of which will always be generic, for example:

• Listing and explaining the various file formats and how they are related
• Selecting a set of metadata fields to describe the files
• Thinking about the structure of the data in the repository and how it links to related resources, projects and collections

One issue that has arisen from this case study, that we were already well aware of, is the preference to deposit research in a project or research group collection rather than a generic departmental or College collection. In many cases the sense of belonging to or affinity with a group is stronger than departmental ties. This is a tricky one for us: DSpace structure centres on a hierarchy of communities, sub-communities and collections; once these have been set up and start to be populated, it is difficult to make significant changes. Add to that the fact that our CRIS, Symplectic, has been painstakingly mapped across to all our existing communities and collections and any structural changes become even more problematic. For the moment we are looking at a possible metadata solution (dc****.research group ??). I’d be interested to hear how others deal with the research project/group requirement.

We’re about to start a similar test case study with Astrophysics and later in the year with an AHRC-funded project based in Classics and Ancient History. It will be interesting to see if the approaches taken in these areas are significantly different, or given different emphasis.

I won’t say that our first case study has allowed us to resolve the many issues raised yet but we are at least more aware of what is important to researchers and can start to take steps to find solutions.

Posted under Big Data, Case studies

This post was written by Jill Evans on May 28, 2012
