Case study – The Cricket-Tracking Project

Other JISC MRD projects or those working with ‘big data’ may be interested in a case study that has been written for Open Exeter by Dr Jacq Christmas.

The case study documents the process of reviewing, preparing, uploading and describing multiple large video files. The project that generated the files is investigating the behaviour of crickets through analysis of thousands of hours of motion-triggered video.

The project is interesting to us for a number of reasons:

• It is a cross-disciplinary/cross-departmental project – this sort of project is becoming increasingly common at Exeter and raises interesting questions around ‘ownership’
• Huge amounts of data have been and continue to be produced
• Storage is a problem due to the number and size of files – most files are stored on external hard drives held in various places
• As there is no central storage system, secure backup can be a problem
• Ditto secure sharing
• The first batch of video is in a proprietary format that requires specific software in order to be viewable

The case study sets out quite clearly the thought that should be given to selecting and preparing files for upload to a repository. We are looking at how the procedures described can be adapted as templates to guide researchers from other disciplines through the deposit process, some aspects of which will always be generic, for example:

• Listing and explaining the various file formats and how they are related
• Selecting a set of metadata fields to describe the files
• Thinking about the structure of the data in the repository and how it links to related resources, projects and collections
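As a rough illustration of what such a template might capture, here is a minimal Python sketch. It is not the procedure from the case study: the class names, field names and example file extensions are all invented for illustration, and a real template would follow whatever metadata schema the repository uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FileFormat:
    """One file format in the deposit (all example values are hypothetical)."""
    extension: str                            # e.g. ".mp4"
    description: str                          # what the files contain
    derived_from: Optional[str] = None        # extension of the format it was converted from
    required_software: Optional[str] = None   # viewer needed for proprietary formats


@dataclass
class DepositTemplate:
    """A checklist-style record of the three generic deposit steps."""
    title: str
    formats: List[FileFormat] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)
    related: List[str] = field(default_factory=list)  # linked projects, collections, resources

    def missing_fields(self, required: List[str]) -> List[str]:
        """Return the required metadata fields that are absent or empty."""
        return [name for name in required if not self.metadata.get(name)]

    def format_chain(self, extension: str) -> List[str]:
        """Follow derived_from links from a format back to its original."""
        by_ext = {f.extension: f for f in self.formats}
        chain: List[str] = []
        current: Optional[str] = extension
        # Stop on an unknown format or a cycle in the derived_from links.
        while current is not None and current in by_ext and current not in chain:
            chain.append(current)
            current = by_ext[current].derived_from
        return chain


# Hypothetical example: an original proprietary format and a converted copy.
template = DepositTemplate(
    title="Example video deposit",
    formats=[
        FileFormat(".rawvid", "Original motion-triggered video",
                   required_software="Vendor viewer (placeholder)"),
        FileFormat(".mp4", "Converted copy for viewing", derived_from=".rawvid"),
    ],
    metadata={"title": "Example video deposit", "creator": ""},
    related=["Example research project page"],
)

print(template.missing_fields(["title", "creator", "description"]))
print(template.format_chain(".mp4"))
```

The point of the sketch is only that the three generic steps (formats and their relationships, metadata fields, links to related resources) can be written down in one structure and checked before upload.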

One issue that has arisen from this case study, and one that we were already well aware of, is researchers’ preference for depositing research in a project or research group collection rather than a generic departmental or College collection. In many cases the sense of belonging to or affinity with a group is stronger than departmental ties. This is a tricky one for us: the DSpace structure centres on a hierarchy of communities, sub-communities and collections; once these have been set up and start to be populated, it is difficult to make significant changes. Add to that the fact that our CRIS, Symplectic, has been painstakingly mapped across to all our existing communities and collections, and any structural changes become even more problematic. For the moment we are looking at a possible metadata solution (dc****.research group ??). I’d be interested to hear how others deal with the research project/group requirement.
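To make the idea of a metadata solution a little more concrete, here is a small Python sketch of the general approach: items stay in their existing collections, and a group view is built from a metadata value instead of from the hierarchy. This is an assumption about how such a solution could work, not a description of DSpace or of anything we have implemented. The field name `research_group`, the item structure and the example values are all hypothetical placeholders.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical placeholder for whichever metadata field is eventually chosen;
# it is not a real Dublin Core or DSpace field name.
GROUP_FIELD = "research_group"


def group_by_research_group(items: List[dict]) -> Dict[str, List[str]]:
    """Build a research-group view of items without moving them between collections."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for item in items:
        # An item may belong to several groups, so the field holds a list.
        for name in item.get("metadata", {}).get(GROUP_FIELD, []):
            groups[name].append(item["title"])
    return dict(groups)


# Invented example: two items in different departmental collections, one shared group.
items = [
    {"title": "Video batch 1", "collection": "Department A",
     "metadata": {GROUP_FIELD: ["Example Group"]}},
    {"title": "Tracking output", "collection": "Department B",
     "metadata": {GROUP_FIELD: ["Example Group"]}},
    {"title": "Unrelated thesis", "collection": "Department A", "metadata": {}},
]

print(group_by_research_group(items))
```

The attraction of this kind of approach is that the existing community and collection structure, and anything mapped to it, is left untouched.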

We’re about to start a similar test case study with Astrophysics and, later in the year, another with an AHRC-funded project based in Classics and Ancient History. It will be interesting to see whether the approaches taken in these areas are significantly different, or are given different emphasis.

I won’t say that our first case study has allowed us to resolve the many issues raised, but we are at least more aware of what is important to researchers and can start to take steps to find solutions.

Posted under Big Data, Case studies

This post was written by Jill Evans on May 28, 2012
