New dataset in our archive

Like many projects, we’re still trying out various prototypes for the transfer and deposit of big data (GBs and TBs); Ian has blogged a bit about this.

However, many of Exeter’s key datasets are not that large and I’ve just deposited one of these in our pilot data repository:

This is the Cogeme Phytopathogenic Fungi and Oomycete EST Database developed by Dr Darren Soanes, a member of Professor Nick Talbot’s Molecular Genetics group.

This is freely available under a CC-BY licence.

I’d be interested to see what people think about the metadata. Let us know if you have any thoughts or suggestions.

Posted under Exeter Data Archive, Metadata

This post was written by Jill Evans on September 21, 2012


PGR feedback on data upload

Last week we asked our group of PGRs to test upload of data to the Exeter Data Archive. I was particularly interested in seeing how they would respond to the interface and the metadata web form.

The following are some of the comments that we received – some of these relate specifically to how DSpace works but some are of general interest:

• Add a sentence to the current licence making it clear that depositors can ask to remove their data/outputs.

• It’s important to be able to see inside a zip file.

• How can multiple files be uploaded?

• It would be used more if it were possible to upload from your own drive – drag and drop rather than entering metadata through the web interface.

• A ‘wizard’-like process would be really helpful.

• Would like a template structure for storing previously entered metadata; this could be selected later for further related deposits.

• Keywords – need intuitive text to appear in the boxes; otherwise we will get an inconsistent and inaccurate list of keywords.

• Upload speed – varied between PGRs; Mac users found it much quicker: a 100 MB audio file uploaded in about 30 seconds; 700 MB took 20 minutes to upload with a Mac.

• The Submit button needs to be much clearer.

• Do you need to log in before you upload, or could you choose to upload and then have to log in? Which is better?

• Metadata – people will cut corners if it’s too onerous.

• Would be good to be able to add projects to the hierarchy (i.e., the DSpace Communities structure).

• DPA – is it contravening the DPA if even an administrator can see sensitive data?

• Data could be encrypted as well as being stored in a ‘dark archive’.

• An upload manager would be a really useful feature – you could queue files for upload and then just leave them.

• Important to add contact details of depositor (PI, etc.), especially email address.

• Clearer help and guidance; make mandatory fields clearer. Title – more specific guidance: is this the title of the deposit or of the depositor?

• Would be useful to have a dropdown list of your previous submissions; you could then choose to link things together (e.g., paper & data), which would make the process easier.

• Confused about the difference between date of publication and date of creation – publication is the date the item becomes publicly available and is needed by DataCite, but DSpace doesn’t automatically assign this detail to the ‘publication’ field.

• Need a more comprehensive list of data types than default Dublin Core list.
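
To put the upload speed comment above into comparable terms, the two reported figures can be converted into average transfer rates. This is only a rough sketch: it assumes "mb" means megabytes and that the timings are approximate, and the function and variable names are made up for illustration.

```python
def throughput_mb_per_s(size_mb: float, seconds: float) -> float:
    """Return the average transfer rate in megabytes per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_mb / seconds


# Figures as reported in the PGR feedback (assumption: "mb" = megabytes).
uploads = {
    "100 MB audio file": (100, 30),       # about 30 seconds
    "700 MB upload": (700, 20 * 60),      # about 20 minutes
}

rates = {
    label: throughput_mb_per_s(size, secs)
    for label, (size, secs) in uploads.items()
}

for label, rate in rates.items():
    print(f"{label}: {rate:.2f} MB/s")
```

On these numbers the smaller upload averaged roughly 3.3 MB/s and the larger one roughly 0.6 MB/s, so the reported speeds varied considerably even between individual uploads.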

Posted under Big Data, Metadata, Technical development

This post was written by Jill Evans on May 31, 2012
