CHARTER

‘“Insufficient data?” Can’t you just make an educated guess?’

– Invader Zim (cartoon)

Not when it comes to repositories.  As DSpace is now live, I thought it was time I finally wrote the metadata post I said I’d do in my last post.

The notion of metadata being ‘data about data’ perhaps belies how complex it is, reinforcing JISC Digital Media’s preference for defining it as ‘structured data about data’.  This structure is the key part in making the data actually work, making it interoperable with other data and ensuring that it is sufficient so no educated guesses are needed!

DSpace uses Dublin Core, one of the simplest metadata schemas with only 15 elements.  However, the material being digitised comes from three different sources, each with its own cataloguing standard (MARC21, ISAD(G) and SPECTRUM), meaning each needed mapping to Dublin Core.  Dublin Core has a number of qualifiers which are used to refine the elements and are particularly useful when mapping other standards.  We have used these qualifiers to ensure that the Dublin Core elements are a true representation of those in the original standard.

It is this simplicity and adaptability that makes the Dublin Core schema so widely used, and as a result help exists online for mapping many different standards to it.  However, there was still a lot of tweaking to be done to make it work for us, especially as we did not need the entire standards mapped for CHARTER, only the parts we were going to be using.  Practical testing proved to be very important, with a great deal of back and forth between Ahmed and myself.

I found the MARC21 mapping the hardest to get to grips with due to the numerical and positional nature of the standard.  Because of this, I developed a cheat sheet of three columns: one with our DC labels, one with the MARC21 field and one for my own notes on how to complete the field.  I found it such a useful tool that I created cheat sheets for SPECTRUM and ISAD(G) too.
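For anyone who prefers to see the idea in code, a three-column cheat sheet of this kind can be sketched in Python as below. This is only an illustration: the rows follow commonly published MARC21 to Dublin Core mappings rather than the actual CHARTER cheat sheet, and the names CHEAT_SHEET and marc_to_dc are made up for the example.

```python
# Illustrative crosswalk only: these rows follow commonly published
# MARC21-to-Dublin-Core mappings, NOT the actual CHARTER cheat sheet.
CHEAT_SHEET = [
    # (DC label, MARC21 field, note on completing the field)
    ("dc.title", "245$a", "Main title; drop trailing punctuation"),
    ("dc.contributor.author", "100$a", "Personal name as catalogued"),
    ("dc.publisher", "260$b", "Publisher name"),
    ("dc.date.issued", "260$c", "Publication date"),
    ("dc.format.extent", "300$a", "Pagination or number of items"),
    ("dc.subject", "650$a", "One DC value per subject heading"),
]


def marc_to_dc(record):
    """Map a dict of 'tag$subfield' -> list of values onto DC labels.

    The input shape is an assumption made for this sketch; a real MARC21
    record would first need parsing into this form.
    """
    dc = {}
    for dc_label, marc_field, _note in CHEAT_SHEET:
        for value in record.get(marc_field, []):
            # Strip the ISBD punctuation MARC21 leaves at the end of subfields.
            cleaned = value.strip().rstrip(" /:;,")
            dc.setdefault(dc_label, []).append(cleaned)
    return dc


example = {
    "245$a": ["Bleak House /"],
    "100$a": ["Dickens, Charles,"],
    "650$a": ["Orphans", "Inheritance and succession"],
}
mapped = marc_to_dc(example)
```

Keeping the notes column alongside the mapping means the same table can serve as both documentation and the lookup used when converting records.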

Further structure is needed for the information that is going into the Dublin Core fields.  This is the small, but important stuff such as ‘What format will the date take?’ ‘What about in free text?’ and ‘What order will the dimensions go in?’  It’s all about standards, uniformity and interoperability (as well as, of course, describing the material and aiding users in discovering it).
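To show what settling the date question can look like in practice, here is a minimal Python sketch that turns a few free-text date styles into one uniform form. The choice of ISO 8601 style output (YYYY, YYYY-MM or YYYY-MM-DD), the list of input patterns and the function name normalise_date are all assumptions for illustration, not the rules we actually adopted.

```python
from datetime import datetime

# Hypothetical input patterns; a real project would list whatever
# styles its source catalogues actually contain.
KNOWN_PATTERNS = [
    ("%d %B %Y", "day"),    # 12 March 1887
    ("%d/%m/%Y", "day"),    # 12/03/1887 (day first)
    ("%B %Y", "month"),     # March 1887
    ("%Y", "year"),         # 1887
]


def normalise_date(text):
    """Return an ISO 8601 style date string, or None if unrecognised."""
    cleaned = text.strip()
    for pattern, precision in KNOWN_PATTERNS:
        try:
            parsed = datetime.strptime(cleaned, pattern)
        except ValueError:
            continue  # try the next pattern
        # Keep only as much precision as the source actually recorded.
        if precision == "year":
            return f"{parsed.year:04d}"
        if precision == "month":
            return f"{parsed.year:04d}-{parsed.month:02d}"
        return f"{parsed.year:04d}-{parsed.month:02d}-{parsed.day:02d}"
    return None
```

Returning None for anything unrecognised (an approximate date such as "c. 1887", say) flags the value for a human decision rather than making an educated guess.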

And I haven’t even covered the technical metadata so I’ll keep it short.  Our technical metadata is embedded in the image files themselves, as well as being exported, and is managed through Adobe Bridge.  This metadata is the EXIF (Exchangeable Image File Format) information which is automatically generated when the image is captured.  In addition to this, using the IPTC (International Press Telecommunications Council) schema which is bundled with Bridge, we embed the minimum information required to identify the file and its contents in the event that it becomes removed from its context.

They say ‘the devil is in the details’, which seems to be true of metadata.  I came out with the phrase ‘small spanners cause big problems’.  One thing I have noticed is that what seems to be a small issue ends up having big repercussions, yet something that seems big ends up being easy to deal with.

And so, with nearly all the ‘t’s crossed and ‘j’s dotted, it will be full steam ahead on filling up the repository.

[Metadata Workflow Guidelines are being produced]

Live service now available!

The live CHARTER repository is now available at collections.exeter.ac.uk.

The service has now been handed over to the IWS team for customisation of the user interface.

Digitisation Workflow and Guidelines

A revised version of the digitisation workflow and guidelines document has been published on ERIC and the CHARTER project web site. This document outlines the workflow and best practice required to implement the digitisation of physical objects as part of the CHARTER project. It details the CHARTER scanning requirements and parameters for the creation of preservation master files and compressed images for viewing.

A brief introduction to the Digitisation Team

As we have been here two months now, we thought it would be a good idea to introduce ourselves on the blog and give an overview of what we’ve been up to since starting in January. The digitisation team comprises Victoria Oxberry, the Digitisation Officer, and Victoria (known as Vicki to reduce confusion) Stobo, the Digitisation Assistant. Our names are quite appropriate for a project on Victorian material, though we have been assured that’s not why we were hired. We are overseen by Ahmed Abu-Zayed but based in Special Collections.

The first week or so was spent undertaking inductions and getting to know the university and campus. We also used this time to begin familiarising ourselves with the equipment/software we would be using. As we had both used variants of much of this, most time was spent with the camera, playing with the different settings and basically getting to know it.

After this, we set up a formal testing plan and began keeping work journals to detail our experiments and their outcomes, any problems encountered etc. Looking back over these, it is easy to see why this has been a good idea as it documents our changing knowledge and in some cases the to-ing and fro-ing over what the best practice should be. It also highlights how you can’t just dive head first into a digitisation project without doing some groundwork first. Even now, we keep changing our minds on some things or realising something needs to be altered in order to work in a particular context. For instance, creating a file naming convention that fits the three different sources is a lot more complicated than you think it will be.
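As a rough illustration of the file naming problem, the Python sketch below builds one style of name from three differently shaped reference numbers. Everything in it is hypothetical: the source labels and prefixes, the example references and the build_filename function are invented for the example and are not the convention we settled on.

```python
import re

# Hypothetical source prefixes; the real CHARTER convention is not shown here.
SOURCE_PREFIXES = {"library": "LIB", "archive": "ARC", "museum": "MUS"}


def build_filename(source, reference, sequence, extension="tif"):
    """Combine a source prefix, a catalogue reference and an image number."""
    prefix = SOURCE_PREFIXES[source]
    # Replace every run of characters that is not a letter or digit with a
    # hyphen, so references such as "MS 207/1" are safe on any file system.
    safe_ref = re.sub(r"[^A-Za-z0-9]+", "-", reference).strip("-")
    # Zero-pad the sequence number so files sort in capture order.
    return f"{prefix}_{safe_ref}_{sequence:04d}.{extension}"


archive_name = build_filename("archive", "MS 207/1", 3)
museum_name = build_filename("museum", "69/1989.12", 12, "jpg")
```

Even in a toy version like this, the awkward decisions show up quickly: which characters to allow, how to pad numbers and what to do when two references collapse to the same cleaned form.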

Metadata testing has been worked into this experimentation, testing out the different mapped metadata schemas and providing feedback on these (a more detailed blog on this will appear at some point, along with one about capture settings).

Out of all this, we have come up with best standards for most of the different material we will be working with. These standards will need to be tweaked here and there for specific items but, importantly, we have a solid starting point from which to do this. This has also helped us to get to know the material and create a workflow specifically tailored to that material and the overall project outcome. It could also be carried forward into future projects, again as a starting point.

We have also now looked at most of the actual items that have been selected and have notes on how we will approach the capture of each one and the particular parts that we are to capture.

As the e-learning module is one of the key outcomes of the CHARTER project, we have prioritised items selected for the course this will be based on and are now about to embark on the definitive capture of these items.

Project Plan Updated

Following JISC approval to use DSpace instead of Fedora, an updated project plan has been created and can be viewed here. http://projects.exeter.ac.uk/charter/documents.htm

Website Update

The CHARTER website now contains all the minutes and actions from board and team meetings under the ‘Documents’ tab.

CHARTER Project Board

The CHARTER Project Board met for the first time on 10 December 2008. Advice was sought from the Board on topics including repository software, metadata standards and selection criteria, as well as on longer term outcomes which might arise from the project. The Board includes CHARTER’s institutional sponsor, the academic lead for the project and an expert in technology, heritage and education. Perhaps most importantly, the academic lead for CHARTER again underlined the teaching and learning needs that are driving the project forward.

Update on Repository

Following a meeting of the Technical Sub-Group last week, CHARTER will be making the final decision on which repository software to use at its next Project Team meeting (17 December). Ray Burnley is currently scoping Fedora and DSpace. Approval will be needed from JISC for any change to the plans submitted. Discussion will also take place at the first Project Board meeting (12 December).

Report from Programme Meeting

I attended the JISC Programme Meeting on 18/19 November in Oxford. The papers presented are now available online at: 
http://www.jisc.ac.uk/whatwedo/programmes/digitisation/events.aspx

These include presentations on Project Management, User Engagement and Web Usability.

Project Initiation Document

On 29 October 2008, the CHARTER Project Board signed off the Project Initiation Document. Whilst considerable work has been ongoing in the background, this milestone sees formal agreement of the scope and effort required to complete this project. The final PID will be uploaded to the project website when that is fully operational.