An Introduction to the Graduate Trainee Digital Archivist Programme

The position of Graduate Trainee Digital Archivist within the Special Collections department of the Bodleian Libraries was a new role developed in 2014. It combines archival work with study towards a postgraduate diploma in Archives Administration.

There are currently two Graduate Trainee Digital Archivists: me (Harriet) and Emily. A typical week for us involves:

  • Updating the Bodleian’s Collections Management Database with information from our twentieth-century accessions registers
  • Assisting the Oxfam archivists with the appraisal and cataloguing of Oxfam’s communications work
  • Invigilating in the Charles Wendell David Reading Room, where Oriental manuscripts and Commonwealth and African Archives are consulted
  • Listing, arranging, repackaging and cataloguing small collections
  • Seeking permissions for, and archiving, web sites which relate to the Bodleian’s collecting focuses
  • Working on our joint development project of improving and enhancing the Bodleian’s Collections Management Database. This involves working with a software developer to implement the necessary changes identified through consulting different users

We also have an afternoon a week dedicated to our studies. We use this time to work on our assignments, reading relevant professional literature and producing reports and essays at set intervals. As a result, we will finish our two-year contract here as qualified Archivists.

As we continue, we will also soon be involved in capturing digital collection material into the Bodleian’s Electronic Archives and Manuscripts digital repository. This will include such tasks as digitising and processing audio-visual material and ingesting and weeding data stored on deposited hardware.

For me, the best aspect of the traineeship is the variety of work we are able to do. We also have the opportunity to shape our time here to reflect the skills we wish to develop, and this has led to me assisting with certain outreach initiatives which I have really enjoyed. Furthermore, conferences, training and the Graduate Trainee sessions have introduced us to the processes and initiatives of the Bodleian, the University and the wider professional community, and helped us contextualise our work within the information management sector, as well as providing us with an understanding of the careers and opportunities available beyond the traineeship. As a result, I have been able to consider what I might like to focus on in the future, and can already see how valuable my experiences here will be when I begin my career as a professional Archivist.

Emma Stanford, BDLSS

One of our Gutenberg Bible images, CC BY-NC-SA Bodleian Libraries.

Hello! I’m Emma, and I’m the trainee in Bodleian Digital Library Systems and Services. I’m from Washington State in the U.S., and I did my B.A. in literature at Middlebury College. During my degree I worked in our interlibrary loan department and studied for a year at Oxford, and after graduating I spent a year working in electronic reserves and copyright processing for a library in California.

My position is new this year, so it’s a bit undefined, but basically I’m working with the digitization department at BDLSS. We’re doing a partnership with the Vatican Library to digitize millions of objects starting this year, and we’ll be working on making these easily accessible to the public. A lot of what I’ll be doing once we get started is processing the images and assigning metadata (page numbers, content labels, etc.), but so far I’ve been working a bit on the project website and reading a LOT about metadata and digitization standards. Today I learned how to retrieve images from the archive. The images we’re using are very high quality, so they take up a lot of space, and they’re actually stored on tapes that get physically fetched by a robot every time we need to copy something from the archive. This happens much more quickly than I would have expected – it only takes a few minutes.

The people I’m working with are a lot more tech-savvy than I am, so I’m looking forward to learning more about the software and languages we’ll be using. I’m also excited to be dealing with such beautiful images, and to be involved in the effort to make them more accessible.

BIALL, CLSIG, SLA Europe Open Day 2013 part 1

Kat Steiner here again, one of the graduate trainees at the Bodleian Law Library. On Wednesday, Frankie Marsden and I headed down to London for the BIALL, CLSIG, SLA Europe Open Day, a day of presentations and tours based at the CILIP headquarters near Russell Square. We thought we’d give you a few of our thoughts on the day, especially on what we individually will take away from it.

A few acronym explanations before we start. BIALL is the British and Irish Association of Law Librarians, CILIP is the Chartered Institute of Library and Information Professionals, CLSIG is a special interest group within CILIP standing for Commercial, Legal and Scientific Information Group, and SLA Europe is the European and UK division of the Special Libraries Association. Still with me? Just the names alone were a lot to take in!

Copyright Wellcome Library
The Wellcome Library

Over the day, we heard nine speakers, whose places of work included London law firms, the law library of City University, the Wellcome Library, the British Medical Association, the Inner Temple, Linex (a company offering current awareness tools and aggregation for subscribers), and the British Library. It was fascinating to hear the stories of how they had reached their current jobs (often by a combination of luck, enthusiasm and perseverance), and their varied positions. It particularly stood out to me how many people mentioned TFPL, a recruitment agency, as being invaluable in helping them find jobs. I hadn’t heard of them, but I will definitely be looking into them now!

There was also the opportunity to go on a tour of either the Wiener Library, a collection for the study of the Holocaust & genocide, the library of the London School of Hygiene & Tropical Medicine, or the library of the Institute of Advanced Legal Studies. As Law Bod trainees, Frankie and I both chose the IALS, and enjoyed a detailed tour and talk by David Gee, the Deputy Librarian. As the library takes three graduate trainees every year, he had a lot of insight and suggestions for what to do afterwards if you are thinking of going into law librarianship.

Several speakers were also from law firm libraries, or law librarians in other institutions, and it was very interesting to hear about their jobs in detail. I hadn’t personally thought much about specialising, or moving away from academic librarianship (I’m hoping to stay at the Bodleian while I do my library school masters), but there definitely seemed to be a lot to recommend ‘special libraries’. The chance to do real legal research was very attractive to me as an academic challenge (at the Law Bod, students are expected to do their own research, although there are lots of classes to help them learn how to do it). However, I’m not sure I could cope with the increased pressure, longer hours and difficult deadlines that come along with it. The rather better pay might sweeten the pill, though.

Copyright Inner Temple Library
The Inner Temple Library

The talk that really stood out for me was from Simon Barron, a Project Analyst at the British Library. He focused on the concept of ‘digital librarians’, and the way that technology is transforming the information profession and will continue to do so. In the days of ‘big data’ (a current buzzword that I’m still not hugely clear on – in my understanding, it can mean data sets so large that they allow statistical programs to crunch through them and draw remarkably accurate conclusions without any attempt at explaining how the causation between the conclusions and the data works), librarians who can code, use technology, and are willing to learn new technological skills will be more and more in demand. He described his current project with the British Library and the Qatar Foundation to create a digital National Library of Qatar. This is an ambitious project, involving huge numbers of documents to be digitised, including 14th- and 15th-century Arabic manuscripts. Simon’s job seemed to involve a lot of technological problem-solving, for example ‘how do we get this data out of this piece of software and into this other piece of software without losing it, or having to do it by hand’. He explained that his coding knowledge was entirely self-taught through Codecademy and that, although he didn’t consider it his crowning achievement, his colleagues were still very impressed when he made a spreadsheet where the boxes change colour depending on the data you enter.

Simon’s talk made a big impression on me, and really confirmed my feeling that the MSc in Information Science is for me. I have some basic experience with coding good practice (a 10-week internship at a software company, writing code in Perl), and the main thing I took away is that it’s really not that hard or scary: it just requires logic, perseverance (read: stubbornness even when it doesn’t work), and the willingness to have a go even if you’re not sure what you’re doing. I believe anyone who really wants to can learn to use technology, but they may not see the point. Simon emphasised the use of technology to automate what would be fairly simple human processes. This is a great point – if you can automate a simple action on a computer (for example, removing formatting from a text file, or averaging each row in a spreadsheet), you not only save time, you make the process scalable to much larger sets of data, which would take humans far too long to deal with, and you reduce the possibility of human error, as long as your code actually works!
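To make the spreadsheet example concrete, here is a minimal sketch in Python, chosen purely for illustration (nothing in the talk specified a language). The function name average_rows and the sample data are hypothetical, and the sketch assumes the spreadsheet has been saved as CSV with a label in the first column and numbers in the rest.

```python
import csv
import io


def average_rows(csv_text):
    """Return a (label, mean) pair for each non-empty row of CSV text.

    Assumes the first cell of each row is a label and the remaining
    cells are numbers; blank cells are skipped.
    """
    results = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue
        label = row[0]
        values = [float(cell) for cell in row[1:] if cell.strip()]
        if values:  # skip rows that have a label but no numbers
            results.append((label, sum(values) / len(values)))
    return results


# Hypothetical sample data: a label followed by some counts.
sample = "Monday,12,15,9\nTuesday,20,22\n"
for label, mean in average_rows(sample):
    print(label, mean)
```

The same few lines work whether the file has two rows or two million, which is the scalability point made above.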

Anyway, you can see that this made quite an impression. Another thing I will take away is how many things are worth joining to get more involved in the information profession. You can join CILIP for £38 a year if you’re a student or graduate trainee – definitely worth doing! You can join SLA (of which SLA Europe is a chapter) for $40 a year if you’re a student (even part-time, but I’m not sure about graduate trainees). You can join BIALL for £17 a year if you are a full-time student. You might want to consider registering with TFPL. SLA Europe offers an Early Career Conference Award, which three of the speakers had won, allowing them to go to amazing conferences in San Diego, Chicago and Philadelphia. BIALL also offers an award for the best library school dissertation on a legal topic. And, finally, Information Architect is a job title it might be worth looking out for.

That’s pretty much all I have to say for this post (I’ve waffled for more than long enough). Frankie will be talking about the aspects of the day that she really liked, and I’m sure they will be very different! I just want to thank everyone who helped organise the conference – it gave me loads to think about, allowed me to meet plenty of other graduate trainees, and generally have a great time. For anyone who wants a more general idea of the day – the slides from the presentations that everyone gave can be found on the CLSIG website.

SPRUCE mashup

Last week I went to a ‘mashup’ event for digital preservationists, organised as part of the SPRUCE project. The idea was to bring together digital collection owners and developers to discuss issues with digital preservation and to work on developing tools to solve these problems. Issues presented ranged from identifying scanned images which have black areas in them to running specialist software built for Windows 95, with many more in between.

While the developers worked on their solutions, the collection owners tried to develop a business case for digital preservation, looking at the benefits, stakeholders and skills gaps in our institutions, culminating in an ‘elevator pitch’ designed to be delivered to scary executives in lifts. By the time our 3 days were up the developers had been able to come up with tools that solved several of the problems we’d arrived with, resulting in some very happy collection owners.

The event was really helpful, and I came away with a much better sense of how to explain the value of digital preservation, as well as a tool which should speed up our capture process. I’ve done a longer write-up for the futureArch blog if you would like to find out more.

Digital Preservation: What I Wish I Knew Before I Started

Last week I attended a student conference on digital preservation, hosted by the Digital Preservation Coalition. The event was called ‘What I Wish I Knew Before I Started’, and several digital preservationists gave some very interesting insights into the skills and challenges of digital preservation. The basic message was that digital preservation is a big task which requires urgent action, but that archivists already have many of the skills needed to carry it out. I’ve posted about it at the futureArch blog, if anyone is interested in finding out more.

Trainee Project Showcase: Graduate Trainee Projects in the Science Libraries

On Wednesday, we concluded our traineeship through the presentation of the projects that we had worked on throughout the year. It was a wonderful opportunity to see what everyone had been working on in their libraries.

I presented my project on the digitization of the Birthday Book of George Claridge Druce (1850-1932), chemist, Mayor of Oxford and one of the great botanists of the early 20th Century, which I worked on at the Sherardian Library in the Department of Plant Sciences. I greatly enjoyed working on this project, and learned a lot, not only about Druce (a most remarkable man), but about the practice of botany in Britain during the early 20th Century (the shift from natural history as collecting to a circumscribed science, and the rise of the conservation movement to preserve rare specimens in the wild rather than just collecting them).

I also learned how to design and implement databases in Access and learned some basic XML coding. The next step will be uploading the Druce database to the UK Archives Hub, where it will be made available for research.

One of my other projects at the Radcliffe Science Library involved making a virtual tour of the library, which was used during the Science Open Days at the RSL when prospective undergraduates visit the library and science departments at Oxford. The virtual tour was done using PowerPoint and Adobe Captivate. You can view it at the following link:

Radcliffe Science Library Virtual Tour

Bodleian Libraries Imaging Studio

On Thursday I joined a tour of the imaging studio at Osney. The tour was led by James Allan, Head of Imaging Services at the Bodleian.

The studio was established in the late nineteenth century by Oxford University Press, and was taken over by the University in the 1970s. It has recently moved to Osney, where it will remain until the refurbishment of the New Bodleian is completed in 2014.

The imaging services team produce digital and print copies of resources held by the Bodleian and other Oxford libraries. They provide services for individuals and institutions both inside and outside the University, and have also been involved in larger projects, such as the production of digital images of the Bodleian’s Medieval and Renaissance Manuscripts. The team is also responsible for negotiating copyright permissions for the images that they produce.

Imaging Services is currently part of Special Collections at the Bodleian, and much of the material that the team work with is drawn from these collections; during the tour, we saw a fourteenth-century illuminated manuscript being photographed.

A variety of equipment is used in the studio, from a bitonal scanner to a high-resolution (39 megapixel) digital camera. Post-production software is used to clean up images. Most impressive was the cradle designed to hold a book as its pages are photographed. This features a vacuum bar which applies gentle suction to the back of a page, holding it in place whilst a photograph is taken.

The imaging technology used by the team is constantly evolving. However, new copies of items cannot be made every time the technology moves forward: funding is not available to do this, and the materials involved are often too fragile to withstand frequent handling. Difficult decisions must therefore be taken about when it is best to photograph or scan items in order to produce images that will remain useful for some years to come.

The team maintains an archive of images of material held by the Bodleian, which includes an extensive collection of photographic plates dating back to the 1950s; there are also large microfilm and digital collections. Images from the archive are often used to fulfil requests to view items that are too fragile to be handled or copied again. The digital archive is not yet accessible online, but there are plans to make this possible in the future.

My thanks go to James Allan for the very informative tour. More information about the services provided by his team can be found on the Bodleian website.

A day-in-the-life with the futureArch project

People have been talking about ‘day-in-the-life’ blog posts to give everyone a better idea of the different kinds of work we all do and I thought I’d kick things off. For those who don’t know, I’m the trainee for the futureArch project and I’m going down the archiving, rather than librarian, route. From what I’ve heard from some of the other trainees, my job is quite different, so hopefully people will find this quite interesting. I think the big difference is that I’m only in a reading room dealing with enquiries and visitors one morning a week and the rest of the time I’m in an office and only have to talk to my colleagues!

So anyway, here is a typical Thursday: I’m based in Osney but on Thursday mornings I go over to the Logic School in the Bodleian Quad as I’m being taught how to use EAD, the cataloguing programme. I’ve been given a small collection of Victorian letters to catalogue first, which thankfully is a pretty standard collection. EAD is pretty simple to use, once you get used to all the little quirks and rules, like when inserting the scope/content you have to also insert a paragraph element within it before you can start writing. If you don’t, EAD has a bit of a tantrum and starts highlighting everything in red. It’s also very fussy about punctuation and where you can and cannot have a comma. The hardest part (but also the most fun) is trying to read the signatures of all the correspondents and work out who they are so I can list them in the catalogue – there’s a lot of variation in the standard of handwriting. Thankfully most of them are famous enough to be in the Oxford Dictionary of National Biography, or as a last resort, Wikipedia. It can be frustrating (one letter was signed ‘H.H.’) but it’s very rewarding when you finally work out who they are (for anyone interested H.H. turned out to be Hugh Haweis).
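For anyone wondering what the scope/content rule looks like underneath, EAD (Encoded Archival Description) records are XML, and the rule amounts to a scopecontent element that wraps its text in a p element. The sketch below builds just that fragment with Python’s standard library. It is an illustration only: the function name and the sample wording are made up, and it is neither a complete EAD record nor the editor described above.

```python
import xml.etree.ElementTree as ET


def make_scopecontent(text):
    """Build a <scopecontent> element whose text sits inside a <p> child."""
    scope = ET.Element("scopecontent")
    paragraph = ET.SubElement(scope, "p")  # the paragraph element comes first
    paragraph.text = text                  # only then can the writing start
    return scope


# Hypothetical description, not taken from a real catalogue.
fragment = make_scopecontent("Letters from various correspondents.")
print(ET.tostring(fragment, encoding="unicode"))
```

Putting the text directly inside scopecontent, without the p wrapper, is the kind of thing that sets off the red highlighting.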

I finish at the Bodleian at 1pm and get back over to Osney (for which I love the minibus). In the afternoon I get on with my ongoing struggle with the CD imaging programme. The futureArch project is looking at born-digital archives (computer files, emails, MP3s etc.) and trying to develop an infrastructure for preserving and archiving these. CD imaging is a form of forensic imaging whereby a programme examines a disk to see what’s there without disturbing any of the metadata. For instance, we don’t want to change the last modified date, so we can’t just open all the files normally to see what’s there, besides which not all files are compatible with PCs. The programme also harvests and creates other metadata, like MD5 hash values. These are effectively unique fingerprints calculated from a file’s contents, so you can easily see if a file actually is a duplicate, or just looks the same. The programme can also retrieve deleted files (though not always intact), but this brings up a whole set of issues over the morality of archiving files the owner doesn’t realise we have. The imager also produces a copy of all the files, which eventually will be preserved in our secure server. This means we’re not reliant on computers still having CD drives in fifty years’ time to be able to view the files – just look at Amstrads: a lot of disks still exist, but if you haven’t got the computer they’re pretty much useless and the data on them is lost.
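The duplicate check is easy to try at home. Here is a minimal sketch using Python’s standard hashlib module. It is only an illustration of the idea, not the imaging programme itself and not a forensic tool; the function names are hypothetical, and it assumes the files are sitting in an ordinary folder.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(folder):
    """Group the files under `folder` by MD5 digest.

    Returns a dict mapping each digest shared by two or more files
    to the list of those files' names.
    """
    groups = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            groups[md5_of_file(path)].append(path.name)
    return {digest: names for digest, names in groups.items() if len(names) > 1}
```

Calling find_duplicates on a folder returns only the groups with matching contents, whatever the files are called. Two files that look the same but differ by a single byte get different digests and so are not grouped.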

Anyway, I have a set of disks to image and we have a robotic loader, so in theory I should be able to pile them up and set the loader to put each one in the machine, image it, take it out and put the next one in, but of course technology is never that simple! Most of the time the loader seems to get confused and tries to remove the disk without first opening the disk drive and then decides to sulk and stop working. It also can’t seem to work out when it’s run out of disks and will bring up an error message and again stop working. I was getting a bit worried I’d pressed the wrong button or something, but my manager assures me other people who use it also complain about it being temperamental. Maybe it just doesn’t like being left in a room by itself.

When I get too annoyed with the loader, or just need a break, I move on to one of my other tasks, which is sorting out one of the boxed collections. It’s in one of the cages across the hall and when it was archived nobody properly went through it, so I get the exciting job of going through the boxes and removing any staples and paperclips left in there and removing papers from metal ring binders and folders and putting them into plain card ones. A lot of the metal has started to rust and it’s quite dirty work, but it makes a change to do something manual that doesn’t require much brain power and it gets me away from the computer screen for a while. I normally spend the afternoon alternating between the imaging and the box sorting and then comes home time!

There are other jobs I do on different days of the week, but that’s a pretty typical Thursday. It’d be great to hear what other people get up to and compare it.

Sarah Hogg, Women’s Health Library

Sarah

Hello, I’m Sarah and I am the library trainee at the Women’s Health Library, based at the Nuffield Department of Obstetrics and Gynaecology at the John Radcliffe Hospital. The library is an online resource and part of an NHS-wide service called ‘NHS Evidence’. It was created as part of the ‘High Quality Care for All Report’ from Lord Darzi, who stated that “All NHS Staff will have access to a new NHS Evidence service where they will be able to get, through a single web-based portal, authoritative clinical and non-clinical evidence and best practices” (NHS Evidence webpage).

With librarians and Clinical Leads/Fellows working on the project, it is designed to provide professionals involved in the care of women with access to the best current knowledge available, and is checked and updated regularly. I look forward to learning more about the ways a digital library works, and doing a project which will benefit the library.

Previously, I was working in the NHS as a health records clerk whilst studying for my LL.M in Commercial Law, and before this, I read English Language at Lancaster University.