SPRUCE mashup

Last week I went to a ‘mashup’ event for digital preservationists, organised as part of the SPRUCE project. The idea was to bring together digital collection owners and developers to discuss issues with digital preservation and to work on developing tools to solve these problems. Issues presented ranged from identifying scanned images which have black areas in them to running specialist software built for Windows 95, with many more in between.

While the developers worked on their solutions, the collection owners tried to develop a business case for digital preservation, looking at the benefits, stakeholders and skills gaps in our institutions, culminating in an ‘elevator pitch’ designed to be delivered to scary executives in lifts. By the time our 3 days were up the developers had been able to come up with tools that solved several of the problems we’d arrived with, resulting in some very happy collection owners.

The event was really helpful, and I came away with a much better sense of how to explain the value of digital preservation, as well as a tool which should speed up our capture process. I’ve done a longer write-up for the futureArch blog if you would like to find out more.

Digital Preservation: What I Wish I Knew Before I Started

Last week I attended a student conference on digital preservation, hosted by the Digital Preservation Coalition. The event was called ‘What I Wish I Knew Before I Started’, and several digital preservationists gave some very interesting insights into the skills and challenges of digital preservation. The basic message was that digital preservation is a big task which requires urgent action, but that archivists already have many of the skills needed to carry it out. I’ve posted about it at the futureArch blog, if anyone is interested in finding out more.

Rebecca Nielsen, futureArch, Digital Archives

Hello! I’m Rebecca, and I’m the futureArch graduate trainee, based within BEAM (Bodleian Electronic Archives and Manuscripts).

I graduated last year from the University of Leicester, where I studied History. Since then I have worked at a GP surgery and volunteered at Shropshire Archives, where I was involved in a project on some Parish collections. This meant I got to work with a lot of old documents, including the relics of a key member of the early Methodist movement.

My role in futureArch is quite different, as I am dealing with born-digital material: things like websites, emails, Word documents, audio files, and digital photos. This means working with sites on the live web, as well as capturing files off CDs and floppy disks. I am also doing some more traditional archive work, assisting in the Special Collections Reading Room once a week and cataloguing paper collections using EAD. I’m looking forward to learning more about archives over the next year, especially the challenges which digital material presents to archivists.

Emma Hancox, futureArch project, Bodleian Library

Hi, I’m Emma and I’m the graduate trainee on the futureArch project. I studied History of Art at the University of Cambridge and graduated in 2007. Before starting at the Bodleian I worked at Saint Nicolas Place, a collection of Tudor buildings in Birmingham, where I sometimes got to dress up in period costume when helping out with community events.

The futureArch project is all about the management of born-digital materials in the Bodleian’s archival collections. This is vital for the future when you think about how many materials typically found in collections are now digitally produced. Just one example is email which has taken over from letter writing as the primary method of correspondence. As well as working on the digital side of things, I am also involved in more traditional archive work such as reading room duty and cataloguing paper collections.

I am really looking forward to contributing to the project and learning more about the Bodleian’s collections as well as finding out about the practicalities of digital preservation. After this year I hope to move on to do an MA in Archives and Records Management.

A day-in-the-life with the futureArch project

People have been talking about ‘day-in-the-life’ blog posts to give everyone a better idea of the different kinds of work we all do and I thought I’d kick things off. For those that don’t know, I’m the trainee for the futureArch project and I’m going down the archiving, rather than the librarian, route. From what I’ve heard from some of the other trainees, my job is quite different, so hopefully people will find this quite interesting. I think the big difference is that I’m only in a reading room dealing with enquiries and visitors one morning a week and the rest of the time I’m in an office and only have to talk to my colleagues!

So anyway, here is a typical Thursday: I’m based in Osney but on Thursday mornings I go over to the Logic School in the Bodleian Quad as I’m being taught how to use EAD, the cataloguing programme. I’ve been given a small collection of Victorian letters to catalogue first, which thankfully is a pretty standard collection. EAD is pretty simple to use, once you get used to all the little quirks and rules, like when inserting the scope/content you have to also insert a paragraph element within it before you can start writing. If you don’t, EAD has a bit of a tantrum and starts highlighting everything in red. It’s also very fussy about punctuation and where you can and cannot have a comma. The hardest part (but also the most fun) is trying to read the signatures of all the correspondents and work out who they are so I can list them in the catalogue – there’s a lot of variation in the standard of handwriting. Thankfully most of them are famous enough to be in the Oxford Dictionary of National Biography, or as a last resort, Wikipedia. It can be frustrating (one letter was signed ‘H.H.’) but it’s very rewarding when you finally work out who they are (for anyone interested, H.H. turned out to be Hugh Haweis).
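For anyone curious what the paragraph rule looks like in practice, here is a rough sketch in Python. It is not how our cataloguing software works. It assumes an EAD 2002 style record with no XML namespace, where the scope/content is a scopecontent element and the paragraph is a p element. The sample records and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue entries, written only for this example.
GOOD_RECORD = """<c level="item">
  <did><unittitle>Letter signed H.H.</unittitle></did>
  <scopecontent><p>Letter to an unnamed correspondent.</p></scopecontent>
</c>"""

BAD_RECORD = """<c level="item">
  <did><unittitle>Letter signed H.H.</unittitle></did>
  <scopecontent>Letter to an unnamed correspondent.</scopecontent>
</c>"""


def scopecontent_without_paragraph(xml_text):
    """Return every scopecontent element that has no direct p child."""
    root = ET.fromstring(xml_text)
    return [
        element
        for element in root.iter("scopecontent")
        if element.find("p") is None
    ]


if __name__ == "__main__":
    for name, record in [("good", GOOD_RECORD), ("bad", BAD_RECORD)]:
        problems = scopecontent_without_paragraph(record)
        print(name, "record:", len(problems), "scopecontent element(s) missing a paragraph")
```

A proper check would validate the whole record against the EAD DTD or schema, which is presumably what the editor is doing when it turns everything red. This only looks at the one rule mentioned above.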

I finish at the Bodleian at 1pm and get back over to Osney (for which I love the minibus). In the afternoon I get on with my ongoing struggle with the CD imaging programme. The futureArch project is looking at born-digital archives (computer files, emails, MP3s etc.) and trying to develop an infrastructure for preserving and archiving these. CD imaging is a form of forensic imaging whereby a programme examines a disk to see what’s there without disturbing any of the metadata. For instance, we don’t want to change the last modified date, so we can’t just open all the files normally to see what’s there, besides which not all files are compatible with PCs. The programme also harvests and creates other metadata, like MD5 hash values. These act as fingerprints of a file’s contents: two files with the same hash are almost certainly identical, so you can easily see if a file actually is a duplicate, or just looks the same. The programme can also retrieve deleted files (though not always intact), but this brings up a whole set of issues over the morality of archiving files the owner doesn’t realise we have. The imager also produces a copy of all the files, which eventually will be preserved in our secure server. This means we’re not reliant on computers still having CD drives in fifty years’ time to be able to view the files – just look at Amstrads: a lot of disks still exist, but if you haven’t got the computer they’re pretty much useless and the data on them is lost.
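To show how hash values help with spotting duplicates, here is a minimal sketch in Python. It is not the imaging programme we use, just an illustration of the idea. It assumes you are working on a copy of the files rather than the original disk, and the function names are my own.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hash of a file's contents as a hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group the files under root by hash and keep groups with more than one file."""
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_hash[md5_of_file(path)].append(path)
    return {
        file_hash: paths
        for file_hash, paths in by_hash.items()
        if len(paths) > 1
    }


if __name__ == "__main__":
    # "copied_files" is a placeholder folder name; point it at any working copy.
    for file_hash, paths in find_duplicates("copied_files").items():
        print(file_hash, [str(path) for path in paths])
```

The hash depends only on the contents, so two files with different names or dates still match if their contents are identical, and two files that merely look alike will not.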

Anyway, I have a set of disks to image and we have a robotic loader, so in theory I should be able to pile them up and set the loader to put each one in the machine, image it, take it out and put the next one in, but of course technology is never that simple! Most of the time the loader seems to get confused and tries to remove the disk without first opening the disk drive and then decides to sulk and stop working. It also can’t seem to work out when it’s run out of disks and will bring up an error message and again stop working. I was getting a bit worried I’d pressed the wrong button or something, but my manager assures me other people who use it also complain about it being temperamental. Maybe it just doesn’t like being left in a room by itself.

When I get too annoyed with the loader, or just need a break, I move on to one of my other tasks, which is sorting out one of the boxed collections. It’s in one of the cages across the hall and when it was archived nobody properly went through it, so I get the exciting job of going through the boxes and removing any staples and paperclips left in there and removing papers from metal ring binders and folders and putting them into plain card ones. A lot of the metal has started to rust and it’s quite dirty work, but it makes a change to do something manual that doesn’t require much brain power and it gets me away from the computer screen for a while. I normally spend the afternoon alternating between the imaging and the box sorting and then comes home time!

There are other jobs I do on different days of the week, but that’s a pretty typical Thursday. It’d be great to hear what other people get up to and compare it.

Victoria Sloyan, futureArch Project


Hi everyone. I’m the graduate trainee for the futureArch project, which is about developing the Bodleian’s archive system so that it can cope with hybrid and born-digital archives. I’m really excited about getting to work on this project and hope I can be useful to the team. I’m also going to be helping out in the Special Collections Reading Room in the Bodleian and am really looking forward to exploring the stacks.

Before this I worked for the Living Archive in Milton Keynes and before that I did a history degree at Lancaster. I still live in Milton Keynes and am commuting to Oxford, which involves getting up a lot earlier than I’m used to!