Technical Update

On the technical side of the project, we have now upgraded our live and test Dspace installations to the latest version (1.8.2) and switched to Oracle 11g (our previous version of Postgres needed to be upgraded, and our support at the University is much better for Oracle). With that done, we’ve been able to move on to developing a submission tool that can cope with some quite heavy research data loads.

The most important scenarios from our perspective are:

  1. How to get large data into Dspace (Gigabytes and possibly even Terabytes)
  2. How to submit data composed of many different files (without zipping them up first)

Feedback from researchers gathered by our colleagues in the library has shown these issues to be very important, and critically, the ‘out of the box’ Dspace submission feature does not handle these very well.

To combat these, we are looking at developing two different prototype submission tools:

  1. A SWORD-based submission tool using Python
  2. A submission tool using the SWORD service document but then submitting via the Dspace command line import script
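To give a feel for what the second approach involves: the Dspace command line importer reads items laid out in its Simple Archive Format, where each item is a directory holding a dublin_core.xml metadata file, a contents file listing one filename per line, and the files themselves. The following is only a rough sketch of how a set of separate files could be arranged that way without zipping them first. It is not our actual tool, and the function name build_saf_item, its parameters and the single title field are assumptions made purely for illustration.

```python
import shutil
from pathlib import Path
from xml.sax.saxutils import escape


def build_saf_item(files, title, archive_dir, item_name="item_000"):
    """Lay out one item in Simple Archive Format and return its directory.

    Hypothetical helper for illustration only: real items would normally
    carry more metadata than a single title.
    """
    sources = [Path(f) for f in files]
    names = [src.name for src in sources]
    # The contents file refers to files by name only, so names must be unique.
    if len(set(names)) != len(names):
        raise ValueError("file names must be unique within one item")

    item_dir = Path(archive_dir) / item_name
    item_dir.mkdir(parents=True, exist_ok=False)

    # Copy each file into the item directory. For very large files a hard
    # link or a move may be preferable to a full copy.
    for src in sources:
        shutil.copy2(src, item_dir / src.name)

    # contents: one bitstream filename per line.
    contents = "".join(name + "\n" for name in names)
    (item_dir / "contents").write_text(contents, encoding="utf-8")

    # dublin_core.xml: minimal metadata, here just an escaped title.
    dublin_core = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<dublin_core>\n"
        f'  <dcvalue element="title" qualifier="none">{escape(title)}</dcvalue>\n'
        "</dublin_core>\n"
    )
    (item_dir / "dublin_core.xml").write_text(dublin_core, encoding="utf-8")
    return item_dir
```

A directory built like this would then be handed to the importer with something along the lines of `[dspace]/bin/dspace import --add --eperson=<email> --collection=<handle> --source=<archive_dir> --mapfile=<mapfile>`, where the values in angle brackets are placeholders and the exact options should be checked against the documentation for the installed version.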

By developing the two tools in parallel we hope to determine which works best. So far we have found that while the Dspace command line tool can submit large files (we have successfully submitted a 6GB piece of data), the SWORD tool has run into some issues.

However, we hope eventually to have at least one fully working solution, if not two, that can be used to submit data of any shape or size.

Posted under Big Data, Technical development

This post was written by Ian Wellaway on July 31, 2012
