Open Exeter Research Data survey is launched

After what seems like hundreds of meetings, multiple drafts and a pilot survey with our PGR students, the Open Exeter Research Data Survey has been launched!

The survey is open to researchers at Exeter from postgraduate research (PGR) level to senior professors. We are hoping for as many responses as possible, although, due to the nature of the beast, I will be happy with a response rate of around 10%.

We are well aware of the demands that researchers face on their time, and we have tried to make the survey as user-friendly and easy to fill in as possible. This has been helped in no small measure by the ease of use of Bristol Online Surveys.

Once we have some results we will be in a far better position to understand the needs and requirements of researchers at Exeter.

For those of you reading this who are Exeter researchers the survey can be found online at:

Open Exeter – the benefits!

We have been thinking about the key benefits that Open Exeter will generate during the project’s lifetime and continue to produce after the project finishes in March 2013. The benefits differ between stakeholders, but we will highlight those for individual researchers and research groups and for the University of Exeter as an institution, as these are the ones we will concentrate on in the project’s advocacy plan.

The evidence for these benefits will range from concrete evidence (e.g. the existence of a one-stop-shop webpage), to quantifiable evidence (e.g. the number of data deposits in the Exeter Data Archive (EDA)), to less measurable indications of the impact of the project (e.g. changes in research culture, or researchers being aware of the concept of good practice in research data management (RDM)).

One way to evaluate differences in behaviour would be to repeat, towards the end of the project (or after it has finished), some of the exercises we are beginning now. We are surveying researchers and PhD students in a number of ways in order to gain knowledge of existing practice in and attitudes towards RDM: we launch our online survey next week, have our first in-depth interview with a researcher tomorrow and will conduct case studies in the near future. By the end of the project we hope to be able to show at least some shift in behaviour and attitudes, although obtaining indications of a widespread cultural change is bound to be a longer process.

These are the three principal benefits that we have identified at this point in the project:

Clarity about and awareness of good practice in RDM across Exeter’s research community.

The evidence for improvement in this area will include:

  • Researchers’ compliance with funders’ policies on RDM
  • A ratified institutional policy on good practice in RDM in which roles and responsibilities are defined
  • A one-stop-shop webpage with clear information and guidance about RDM
  • Discipline-specific data management training programmes
  • Improved data management skills of researchers and support staff, measured by their uptake of training programmes and the number and nature of queries about RDM.

Increased visibility and reuse of University of Exeter research

We hope to see the following evidence of increased discoverability and reuse of University of Exeter research:

  • The existing pilot DSpace data repository will be extended across the University to all disciplines
  • Increase in the number of datasets deposited in EDA
  • Increase in number of views of University of Exeter research data and research outputs
  • Increase in number of citations of University of Exeter research data and publications
  • Increased research funding as a result of good practice in RDM

A sustainable long-term plan for University RDM services

Our evidence will include:

  • A medium- and long-term costed plan for Exeter’s RDM services, based on researchers’ input and feedback and anticipating changes in funders’ policies and international standards for RDM
  • Training and guidance embedded into existing training modules and services
  • Information and advice about RDM flagged to researchers at relevant points in the research lifecycle

Preliminary Findings from the First Open Exeter PGR Workshop

We’re putting together a report outlining findings in a more readable form but in the meantime we thought others might be interested in what came out of our first workshop with PGRs back in December:

There was a basic difference between the way PGRs from Science-based and Humanities/Social Sciences subjects perceived data and the descriptions commonly applied to it.

In the experience of the more Science-based PGRs the people they work alongside don’t use the terms ‘primary’ and ‘secondary data’ – even the word ‘data’ is problematic for some.

Scientists agreed that ‘primary’ data would be data that you’ve created yourself, while ‘secondary’ data already exists.

Humanities folk thought primary data was something you might look at in an archive.

In some cases you’re studying someone else’s data, a film, for example – it’s your research data but you don’t own it.

What you do to data, such as editing or manipulation, is your own original contribution, your intellectual input.

Is there confusion between data and sources?  These have specific meanings.

A source is something you haven’t created yourself but if you take it and do something to it yourself, does it then become your data?

All agreed on the following terms to distinguish between data ‘stages’:

  • Raw data
  • Processed or analysed data
  • Published data.

Published data was the form over which there was least disagreement.

Talking about data in terms of its stage in the process or methodology seemed to be more useful, and to gain more general agreement, than trying to define ‘types’ of data.

Data is different according to what is done to it rather than where it comes from or how it’s created.

Data tends to ‘become’ yours when you’ve done something to it, although questions were raised here about whether it is then secondary data.

For Sciences PGRs, post-processing of raw data is what makes it meaningful.

All agreed that data on its own without context or information on methodology or process is unusable.

There is little understanding of the term metadata in the way that it would normally be used in the Library/Information community.

Scientists understood the term metadata to mean meta-analysis; others hadn’t come across it at all. However, all had come across the concept of keywords, such as those used when searching Google or an online database.

At this point many of the students realised they did use metadata but had different terms for it: descriptive data, contextual data, etc.

One student adds metadata manually to the Properties element of a document.

Some students saw bibliographies as metadata.

No-one had heard of RDM apart from two students who had come across the concept when working in industry, particularly in the context of using sensitive data, which was highly monitored.

There was a general feeling that at the University you are left on your own to discover things and work out your own systems.

Many of the students had problems organising files, and different versions of the same document sometimes got confused – this is particularly the case when students work on research in more than one place and then try to sync files.

None had come across the 8.3 standard for naming files. One of the students showed particular interest in this as some of his data had become corrupted in the past; he thought it might be because he used very long and complex file names.
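As a rough illustration of what checking file names against the 8.3 convention could look like, here is a minimal Python sketch. It assumes a simplified reading of the rule (a base name of up to eight characters plus an optional extension of up to three, using only letters, digits, underscores and hyphens); the real convention permits a few more symbols. The function names are hypothetical, and nothing like this was shown at the workshop.

```python
import re

# Simplified 8.3 pattern (an assumption, not the full rule set):
# 1-8 character base name, then an optional dot and 1-3 character extension.
_SHORT_NAME = re.compile(r"[A-Za-z0-9_-]{1,8}(\.[A-Za-z0-9_-]{1,3})?")


def is_8_3_name(filename: str) -> bool:
    """Return True if the whole file name fits the simplified 8.3 pattern."""
    return _SHORT_NAME.fullmatch(filename) is not None


def long_names(filenames):
    """Return the file names that do NOT fit the simplified 8.3 pattern."""
    return [name for name in filenames if not is_8_3_name(name)]
```

Calling `long_names` on a list of file names returns the ones a student might want to shorten. It only checks the names themselves and says nothing about whether a long name actually caused any corruption.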

All were interested in what tools are available to help them manage their sources.  EndNote is very much promoted by the University but training is hard to get – group sessions are often booked up.  Training is available when you first start but that’s not really the time you need it – you get a lot of sessions in the first weeks and it’s too much to take in.

All were interested in learning from each other about methods and tools for handling bibliographic data – this is something they all have in common regardless of discipline.

They had all received different levels of help, training and advice depending on their subject and department but nothing on RDM.

All found it difficult to know where to look for help on specific topics – the University web site is confusing.  None of them realised that there are a variety of information and research skills guides produced by the Library and freely available in ELE (Exeter’s VLE).

All thought an essential ‘survival’ starter guide for PGRs would be really useful – everything you need to know accessible from one place and always there when you need it.

All had learnt what they needed to know to survive from peers and they liked to learn this way.  We suggested trying to formalise and embed this way of learning and teaching – comments were that it would have to be made worthwhile.

If RDM help and guidance is put in one place on the web, projects and research groups can pick and mix relevant documents to make up their own RDM policy and keep it in offices or labs where everyone can access it.

Opinion on how to obtain training was mixed – most prefer face to face, hands-on training with video as a back-up – it’s there when you need it.  Training that is specific to their subject area is most useful.  It should be departmentally based although there are some generic elements that could be delivered by the Library.

MRD South West Regional Meeting

Gareth, Hannah and I attended the MRD South West projects meet-up last week at Bath. Thanks to colleagues at Bath (especially Jez Cope) for arranging and hosting the meeting. It was a useful opportunity to meet other colleagues and explore areas for potential collaboration. Below are some brief notes and thoughts that we took away from the meeting. We hope to follow this up at Exeter in a couple of months’ time, when we will all have a bit more to share:

  • Bath and Bristol both looking at connecting backup/VRE with repository for easy deposit from integrated environment; Bath looking at using SWORD; Bristol building their repository on top of their RDSF data storage system; Exeter have separate backup and repository systems and are looking at how these can be joined up, also looking at SWORD.  Bath and UWE using Eprints, Bristol using Drupal, Exeter DSpace.
  • All looking at how repository/data deposit can be integrated with CRIS – PURE at Bristol and Bath, Symplectic at Exeter.
  • Methodologies for investigating data management: Exeter is using an adapted version of DAF; Bath used DAF last year; UWE will also be using a questionnaire method; Bath and Bristol are looking at the use of CARDIO further into the project.
  • Looking at other methods of capturing information about data use, we wondered where data management queries are currently going. Do they mainly go through IT, and is there an accessible record of them?
  • We talked about capacity for dealing with DMP queries for the long term, part of sustainability planning; there’s been a lot of focus on training for PGRs but at what point should we start focussing on training for librarians and support staff?
  • All felt that the Library should be the eventual home for RDM advice and curation service (but possibly RKT or equivalent at DMP/bid writing stage).
  • Build DMP awareness into mandatory stage of funding bid process, i.e., financial check.  This seems to be the only compulsory element of bid submission.
  • Discussion around when to start PGR training – when they start is not the time to overload them with info.  Basic ‘survival’ guidance could be followed up at various points in their research with more specific training.
  • Policy: the Edinburgh research data management policy is a great starting point but may need updating to take account of more recent developments (e.g., EPSRC policy)?
  • Generic principles are needed at institutional level (issues of ratification); these can be adapted for more detailed discipline-specific use, which would be easier to update, possibly annually. There are issues here regarding who has responsibility for managing updates. There needs to be a clear distinction between policy and procedure.
  • Any institutional policy developed should take care not to deter commercial/industrial partners (clear opt out of Open Access) and we need to be aware of collaboration issues.
  • How to enforce policy? Initially just guidelines, not heavily monitored, as culture and practice hopefully change over time? The possibility of funding councils imposing sanctions for non-compliance (e.g., with requirements to deposit on Open Access) is a useful stick (also a bit of a carrot). It was mentioned that funders are starting to turn down proposals that contain weak DMPs/Technical Appendices.
  • Advocacy: what are the benefits for researchers? Secure storage; possibility of citing data for REF; links with subject specific data repositories; not depositing data twice in different repositories.
  • Possibility of future use of data repository stats for REF tracking/institutional benchmarking.

