Tag Archives: #SkillsForTheFuture

What’s it like to be a trainee? Francesca Miller, Graduate Trainee Digital Archivist 2020-2022

I discovered the Graduate Trainee Digital Archivist role while looking for jobs in Oxford that gave opportunities for training and learning. Though I didn’t know much about digital archiving, as soon as I read the advert I knew it was the job for me and I was delighted when I got it. My background is quite varied, having degrees in Graphic Design and Mathematics and being part way through my MSc in Maths, all while working in the financial sector. I decided I wanted a change of direction and the traineeship was the perfect opportunity to use my skills and knowledge in an interesting and developing field.

Having no previous experience of archives hasn’t been a barrier, as the role introduces you to the work through on-the-job training and the PGDip at Aberystwyth University. While studying and working at the same time can be demanding, the course complements the job and I already feel like I know so much more. While there have been some aspects of the role I haven’t been able to do, due to remote working, there is always plenty of work and I have learnt so much in 7 months. Working alongside the other two trainees has been really enjoyable and our twice-weekly web archiving session via Teams has worked really well. They have both been very supportive and are always willing to answer my questions and guide me through the complexity that is web archiving. The role has enabled me to expand my coding skills by learning XML as part of a retro-conversion project and to learn new skills around cataloguing and indexing.

I started my traineeship in September 2020 and like a lot of what has happened in the last year it hasn’t gone quite as expected! Due to the pandemic, I haven’t been into the Weston Library since my interview which took place prior to the first lockdown. Despite working from my home and not meeting my colleagues in person, I have been made to feel welcome and part of the digital archiving team. I am very much looking forward to discovering more aspects of digital archiving during the rest of my traineeship and I hope very soon to be able to experience the Bodleian Library in person!

Francesca Miller, May 2021

What’s it like to be a trainee? Simon Mackley, Graduate Trainee Digital Archivist 2020-2022

I applied to become a Graduate Trainee Digital Archivist because I wanted a route into the profession that would also equip me with the skills for working with archives in the digital age. I have always had a keen interest in the past: I originally trained as a historian, completing a PhD in British imperial history at the University of Exeter in 2016. The following year I got my first archives job, working as an Archives Assistant as part of the Conservative Party Archive team at the Bodleian. I found the work fascinating and really rewarding, and it didn’t take me long to decide that I wanted to pursue this as a career. When the opportunity to apply for the traineeship came up, I jumped at the chance!

One of the great things about being a digital archives trainee is that you get to work across a wide range of projects and collections. For instance, my work over the past year has included reviewing, indexing, and publishing a hugely diverse range of catalogue records as part of the Summary Catalogue project, as well as more technical tasks such as retro-converting the historic catalogues of the University Archives to make them machine-readable. A particular highlight for me has been working on the Bodleian Libraries’ Web Archive. Not coming from a technical background, this was a completely new area for me. However, working on the web archive has become one of my favourite parts of the traineeship, and I have really enjoyed developing a new set of skills in this area.

I have also found studying for the postgraduate diploma at Aberystwyth University really interesting, and the topics covered often prove very useful in my day-to-day work. Balancing distance learning with full-time work can be challenging at times, but you get plenty of support from your fellow trainees and the tutors at Aberystwyth. As a digital archives trainee, you also get to take part in the Oxford Libraries Graduate Trainee programme, and I have really enjoyed having the chance to learn more about the wider work of the Bodleian and College libraries.

Obviously, working during the Coronavirus pandemic has brought with it its own unique challenges: my first day in the role ended up coinciding with the start of the first national lockdown! Fortunately, the nature of digital archives means that there have still been plenty of tasks to get on with while working from home. The Aberystwyth University course has also been able to continue throughout the pandemic, so I’ve not had any disruption to my studies.

I am now halfway through the traineeship, and looking back on the past year I am amazed at how much I have already learned. I really value the opportunities I’ve had in this role, and I cannot wait to see what the next year has in store!

Simon Mackley, April 2021

Web Archiving & Preservation Working Group: Social Media & Complex Content

On January 16 2020, I had the pleasure of attending the first public meeting of the Digital Preservation Coalition’s Web Archiving and Preservation Working Group. The meeting was held in the beautiful New Records House in Edinburgh.

We were welcomed by Sara Day Thomson, who in her opening talk gave us a very clear overview of the issues and questions we increasingly run into when archiving complex/dynamic web or social media content. For example, how do we preserve apps like Pokémon Go that use a user’s location data or even personal information to individualize the experience? Or where do we draw the line in interactive social media conversations? After all, we cannot capture everything. But how do we even capture this information without infringing the rights of the original creators? These and more musings set the stage perfectly for the rest of the talks during the day.

Although I would love to include every talk held this day, as they were all very interesting, I will only highlight a couple of the presentations to give this blog some pretence at “brevity”.

The first talk I want to highlight was given by Giulia Rossi, Curator of Digital Publications at the British Library, on “Overview of Collecting Approach to Complex Publications”. Rossi introduced us to the emerging formats project, a two-year project by the British Library. The project focusses on three types of content:

  1. Web-based interactive narratives where the user’s interaction with a browser based environment determines how the narrative evolves;
  2. Books as mobile apps (a.k.a. literary apps);
  3. Structured data.

Personally, I found Rossi’s discussion of the collection methods in particular very interesting. The team working on the emerging formats project does not just use heritage crawlers and other web harvesting tools, but also file transfers or direct downloads via access code and password. Most strikingly, in the event that only a partial capture can be made, they try to capture as much contextual information about the digital object as possible including blog posts, screen shots or videos of walkthroughs, so researchers will have a good idea of what the original content would have looked like.

The capture of contextual content and the inclusion of additional contextual metadata about web content is currently not standard practice. Many tools do not even allow for their inclusion. However, considering that many of the web harvesting tools experience issues when attempting to capture dynamic and complex content, this could offer an interesting work-around for most web archives. It is definitely an option that I myself would like to explore going forward.

The second talk that I would like to zoom in on is “Collecting internet art” by Karin de Wild, digital fellow at the University of Leicester. Taking Agent Ruby – a chatbot created by Lynn Hershman Leeson – as her example, de Wild explored questions on how we determine what aspects of internet art need to be preserved and what challenges this poses. In the case of Agent Ruby, the San Francisco Museum of Modern Art initially exhibited the chatbot in a software installation within the museum, thereby taking the artwork out of its original context. They then proceeded to add it to their online Expedition e-space, which has since been taken offline. Only a screenshot of the online artwork is currently accessible through the SFMOMA website, as the museum prioritizes the preservation of the interface over the chat functionality.

This decision raises questions about the right ways to preserve online art. Does the interface indeed suffice or should we attempt to maintain the integrity of the artwork by saving the code as well? And if we do that, should we employ code restitution, which aims to preserve the original artwork’s code, or a significant part of it, whilst adding restoration code to reanimate defunct code to full functionality? Or do we emulate the software, as the University of Freiburg is currently exploring? How do we keep track of the provenance of the artwork whilst taking into account the different iterations that digital artworks go through?

De Wild proposed to turn to linked data as a way to keep track of particularly the provenance of an artwork. Together with two other colleagues she has been working on a project called Rhizome in which they are creating a data model that will allow people to track the provenance of internet art.

Although this is not within the scope of the Rhizome project, it would be interesting to see how the finished data model would lend itself to keeping track of changes in the look and feel of regular websites as well. Even though the layouts of websites have changed radically over the past number of years, these changes are usually not documented in metadata or data models, even though they can be as much of a reflection of social and cultural changes as the content of the website. Going forward it will be interesting to see how the changes in archiving online artworks will influence the preservation of online content in general.

The final presentation I would like to draw attention to is “Twitter Data for Social Science Research” by Luke Sloan, deputy director of the Social Data Science Lab at the University of Cardiff. He provided us with a demo of COSMOS, an alternative to the Twitter API, which is freely available to academic institutions and not-for-profit organisations.

COSMOS allows you to either target a particular Twitter feed or enter a search term to obtain a 1% sample of the total worldwide Twitter feed. The gathered data can be analysed within the system and is stored in JSON format. The information can subsequently be exported to a .CSV or Excel format.
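To give a rough idea of what moving from JSON to CSV involves, here is a minimal Python sketch that flattens a couple of tweet-like records into CSV using only the standard library. The field names (`id`, `created_at`, `text`, `user.screen_name`) and the sample records are my own assumptions for illustration; they are not the COSMOS export schema, and COSMOS handles this export itself.

```python
import csv
import io
import json

# Hypothetical sample data: tweet-like JSON records, one per line.
SAMPLE_JSON_LINES = """\
{"id": 1, "created_at": "2020-01-16T10:00:00Z", "text": "Hello, archive", "user": {"screen_name": "example_a"}}
{"id": 2, "created_at": "2020-01-16T10:05:00Z", "text": "Line one\\nline two", "user": {"screen_name": "example_b"}}
"""

# Columns chosen for this illustration only.
FIELDS = ["id", "created_at", "screen_name", "text"]


def flatten(record):
    """Pick a flat set of columns out of one nested record."""
    return {
        "id": record.get("id"),
        "created_at": record.get("created_at"),
        "screen_name": record.get("user", {}).get("screen_name"),
        "text": record.get("text"),
    }


def json_lines_to_csv(json_lines):
    """Convert newline-delimited JSON records to a CSV string."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for line in json_lines.splitlines():
        if line.strip():  # skip blank lines
            writer.writerow(flatten(json.loads(line)))
    return out.getvalue()


if __name__ == "__main__":
    print(json_lines_to_csv(SAMPLE_JSON_LINES))
```

Nested fields such as the user object have to be flattened into columns, and text containing commas or line breaks has to be quoted, which the `csv` module does automatically.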

Although the system is only able to capture new (or live) Twitter data, it is possible to upload historical Twitter data into the system if an archive has access to this.

Having given us an explanation of how COSMOS works, Sloan asked us to consider the potential risks that archiving and sharing Twitter data could pose to the original creator. Should we not protect these creators by anonymizing their tweets to a certain extent? If so, what data should we keep? Do we only record the tweet ID and the location? Or would this already make it too easy to identify the creator?
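As a thought experiment, the “tweet ID and location only” option could look something like the Python sketch below. The record layout, the `minimise` function and the salted-hash author key are all my own assumptions for illustration; this is not something Sloan presented or a feature of COSMOS.

```python
import hashlib


def minimise(record, salt):
    """Reduce a tweet-like record to tweet ID, place and a pseudonymous author key.

    The author key is a truncated salted SHA-256 hash of the user ID, so tweets
    by the same author can still be grouped without storing the ID itself.
    """
    user_id = str(record.get("user", {}).get("id", ""))
    digest = hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()
    return {
        "tweet_id": record.get("id"),
        "place": record.get("place"),
        "author_key": digest[:12],
    }


if __name__ == "__main__":
    # Hypothetical record; the field names are assumptions.
    sample = {
        "id": 42,
        "text": "Some tweet text",
        "place": "Edinburgh",
        "user": {"id": 7, "screen_name": "example_user"},
    }
    print(minimise(sample, salt="keep-this-secret"))
```

Even this reduced record illustrates Sloan’s worry: while the original tweet is still online, its ID can be used to look the tweet up again, so dropping the text and username reduces the risk of identification rather than removing it.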

The last part of Sloan’s presentation tied in really well with the discussion about the ethical approaches to archiving social media. During this discussion we were prompted to consider ways in which archives could archive Twitter data, whilst being conscious of the potential risks to the original creators of the tweets. This definitely got me thinking about the way we currently archive some of the Twitter accounts related to the Bodleian Libraries in our very own Bodleian Libraries Web Archive.

All in all, the DPC event definitely gave me more than enough food for thought about the ways in which the Bodleian Libraries and the wider community in general can improve the ways we capture (meta)data related to the online content that we archive and the ethical responsibilities that we have towards the creators of said content.

Because Digital Objects can Decay too: Conducting a Proof of Concept for Archivematica

Like other archives, the Bodleian Libraries has been searching for ways to optimize the conservation of our digital collections. The need to find a solution has become increasingly pressing as the Bodleian Electronic Archives and Manuscripts (BEAM), our digital repository service for the management of born-digital archives and manuscripts acquired by the Special Collections, now contains roughly 13TB worth of digital objects, with much more waiting in the wings.

In order to help us manage the ingest of digital objects within our collections, the Bodleian Libraries undertook an options review as part of its DPOC project. This led to a decision to conduct a proof of concept of Archivematica. This proof of concept included the installation of a QA and DEV environment with the help of Artefactual, followed by an extensive testing period and a gap analysis.

In November 2018 we started testing the system to establish whether or not Archivematica met our acceptance criteria. We mainly focussed on three areas:

  1. Overall performance/functionality: Is the system user-friendly? Can it successfully process all the different file types and sizes that we have in our collection?
  2. Metadata: Can Archivematica extract the metadata from the Excel sheets that we have created over time? What technical metadata does Archivematica automatically extract from ingested files?
  3. File extraction and normalization: Are disk images extracted properly? Is the content of a transfer normalized to the right file type?

Whilst testing, we also reached out to and visited other organisations that had already implemented Archivematica, including the International Institute of Social History in Amsterdam, the University of Edinburgh, the National Library of Wales and the Wellcome Trust.

Based on the outcomes of the tests we conducted, and the conversations we had with other institutions, we identified five gap areas:

  1. Performance: The Archivematica instance we configured for the Proof of Concept struggled with transfers over 200GB or transfers that contain more than 5,000 files.
  2. Error reporting: It was often unclear what a particular error code and message meant. The error logs used by system administrators are also verbose, making it hard for them to pinpoint the error.
  3. Metadata: Here we identified two gaps. Firstly, there is the verbosity of the metadata. Because Archivematica records individual PREMIS events for each digital file, the resulting METS file becomes unwieldy, compromising the system’s performance. Secondly, we require a workflow to migrate our legacy pre-ingest capture metadata and file-level metadata, currently held in spreadsheets, into Archivematica. The same workflow needs to support future ingests, as this pre-ingest metadata will continue to be recorded in spreadsheet form for the foreseeable future.
  4. User/access management: Archivematica does not offer a way to manage access to collections or Archival Information Packages, and allows all users to alter the system workflow. We are a multi-user organisation, and wish to have tighter controls on access to collections and workflow configurations.
  5. General reporting: Archivematica currently does not offer many reports to monitor progress, content and growth of collections.
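The performance gap suggests a simple check that could be run before starting a transfer. The Python sketch below walks a transfer directory and flags it if it exceeds the 200GB or 5,000-file thresholds we observed. It is only an illustration: the function names are hypothetical, decimal gigabytes are assumed, and it is not part of Archivematica or of our actual workflow.

```python
import os
import tempfile

# Thresholds taken from the gap analysis above; decimal gigabytes are assumed.
MAX_BYTES = 200 * 10**9
MAX_FILES = 5000


def summarise_transfer(path):
    """Return (file_count, total_bytes) for every file under path."""
    file_count = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total_bytes += os.path.getsize(os.path.join(root, name))
            file_count += 1
    return file_count, total_bytes


def exceeds_limits(file_count, total_bytes, max_files=MAX_FILES, max_bytes=MAX_BYTES):
    """True if the transfer is larger than either threshold."""
    return file_count > max_files or total_bytes > max_bytes


if __name__ == "__main__":
    # Build a tiny throwaway transfer so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as transfer_dir:
        for i in range(3):
            with open(os.path.join(transfer_dir, f"file_{i}.txt"), "wb") as handle:
                handle.write(b"0123456789")
        count, size = summarise_transfer(transfer_dir)
        print(count, size, exceeds_limits(count, size))
```

A transfer that fails the check could then be split into smaller batches before ingest, although where exactly the limits lie will depend on how a given instance is configured.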

Once we had identified these gaps, we had an intensive two-day workshop with Artefactual to pinpoint possible solutions, which we subsequently presented to the wider Archivematica community during the Archivematica Camp in London in July 2019.

We will use all the input gathered from the proof of concept to inform our initial implementation of Archivematica, which will begin in January 2020. The project will focus on the performance and metadata gaps identified during the proof of concept, allowing us to bring Archivematica into production use in 2021. We are keen to work with the Archivematica community, so do get in touch at beam@bodleian.ox.ac.uk if you’re interested in finding out more about our work.

Update from Carl Cooper, former Graduate Trainee Digital Archivist 2017-2019

On finishing the Traineeship at the Bodleian I secured a position as Assistant Archivist at the Bank of England. Currently I am working on a large-scale project accessioning records from Records Management into the Archive. This involves exporting, cleansing and enriching the metadata from the Records Management Database and importing this into Calm, creating authority terms, locations and archival hierarchical structures. The material then needs to be physically moved from the Head Office Record Centre into the Archive, labelled and re-boxed. My experience at the Bodleian and from the course has enabled me to streamline and inform decisions about the workflow and processes for this project. It was enormously beneficial to have previously worked on large-scale collections with complex series structures. This has enabled me to tackle similar challenges in my new role.

I have had the opportunity to meet and collaborate with a variety of different people whilst working at the Bank, thanks to the varied opportunities and tasks. I recently took part in Museums at Night, meeting with the public to talk about our collections, and attended the ARA conference in Leeds with my colleagues from the Bank. I collaborated with Senior Management and Business Architecture Analysts on developing the Archive’s digital preservation and curation capabilities, putting the knowledge and experience gained from the course and traineeship into practice to inform standards, workflows and system requirements. I have just successfully delivered a digitisation project in which 5 volumes of the Minutes of the Court of Directors were digitised and will soon be made available online.

Another aspect of my role is to help with the Archive’s research service, researching and answering enquiries on the collection from both internal and external stakeholders, booking in visitors, processing researchers’ information and IDs, escorting visitors, retrieving material and liaising with the Bank’s Information and Data Protection Teams. The traineeship was a solid foundation with a group of remarkable people, in which I gained innumerable skills to be able to confidently undertake the tasks I do now at the Bank. I made some great friends in Oxford and I am sure we will assemble again in the future. In the meantime, I am thoroughly enjoying life at the Bank Archive.

Carl Cooper, Dec 2019

Update from Kelly Burchmore, former Graduate Trainee Digital Archivist 2017-2019

Kelly Burchmore in front of the Daniel Meadows display, Weston Library, Oct 2019

In March 2019, fresh out of the traineeship, I began a 3-month Project Archivist role at Archives & Special Collections for University of Surrey. This role allowed me both to build on the cataloguing experience I had gained at the Bodleian Libraries to create a detailed catalogue of the Geraldine Stephenson archive, and to take full ownership of a project with a strict brief and deadlines. I was also able to inform cataloguing practice and processes in the department going forward, by making suggestions which were incorporated into their guidance manuals. My 3 months in Special Collections at Surrey were a great balance of experiencing how other institutions do things differently, and developing both theoretical and practical knowledge from the traineeship. One of the things I love most about working as an archivist is that you never stop learning, as the community and best practice are always developing.

My second post as a newly qualified archivist is a return to the Bodleian Libraries’ Special Collections department at the Weston Library, in the role of Newly Qualified Project Archivist. My time is split across three to four different projects, so my week tends to be really varied. For example, I have recently finished the first edition catalogue for the Archive of Daniel Meadows, photographer and social documentarist – a project which offered up various challenges, all of which I relished, including: submitting a detailed project proposal, conservation for extensive photographic material, the separation and capture of digital material and working with various stakeholders such as Daniel Meadows himself, as well as the exhibitions department. Currently I also work as a team with two digital archivist trainees on the Bodleian Libraries’ Web Archive, which means I am able to build on technical skills and apply solutions I first learned in the traineeship.

Since qualifying, I have really appreciated the Archives and Records Association (ARA) Section for New Professionals (SfNP) as a valuable community. They organise and host seminars, which are a great opportunity to network and meet fellow new professionals, and offer bursaries to support newly qualified archivists.

The digital archivist traineeship equipped me with the skills and attitude needed to pursue a career in archives, and I recommend it wholeheartedly. I also feel so grateful for the traineeship providing me with experience of, and exposure to, many aspects of the sector; I draw on these experiences and learning curves in my work to date.

Kelly Burchmore, Oct 2019

Update from Iram Safdar, former Graduate Trainee Digital Archivist 2017-2019

Photograph of Iram at Historic Environment Scotland, 2019.

After completing my Graduate Digital Archivist traineeship at the Bodleian Libraries, I moved back to Scotland to join Historic Environment Scotland, the lead public body established to investigate, care for and promote Scotland’s historic environment, as a Digital Archivist for the Digital Projects Team. Situated in the Archives department, the Digital Projects Team is a three-year funded project aiming to significantly increase the volume of our archival material made accessible online. My role in the team is to manage the cataloguing and ingest of the born-digital material which has been deposited with the Digital Archive. This material consists primarily of archaeological project archives, as well as architectural and maritime projects, and includes photographs, computer-aided design drawings, video footage, spatial data and more. This work has helped to inform appraisal policy, archival description standards, and digital preservation workflows. I use the skills I developed while undertaking my traineeship and Diploma on a daily basis, to enhance the discoverability of our collections and to ensure their long-term accessibility through a robust digital preservation infrastructure. I look forward to seeing developments in this sector in the years to come.

Iram Safdar, Sep 2019

Update from Ben Peirson-Smith, former Graduate Trainee Digital Archivist 2017-2019

Photograph of Ben in Manchester, 2019.

Since completing my traineeship at the Bodleian in March 2019 I have begun working as a Digital Archivist in the Research and Knowledge Exchange Directorate at Manchester Metropolitan University (MMU). My current role has involved applying my digital archive skills to research data management. The work in this role has included developing institutional data management plans for research projects, advising academics on which file formats to use for live research data and long-term storage, integrating the DCC lifecycle model into how data is managed at the University, customising and delivering training to academic staff, and conducting metadata audits within the various systems used by RKE. Alongside this, I have been involved in contributing towards MMU’s REF2021 submission; this has included developing digital portfolios that evidence practice-based research and auditing the REF open access compliance of academic journals deposited into MMU’s e-space digital repository.

Ben Peirson-Smith, Sep 2019

Update from Miten Mistry, former Graduate Trainee Digital Archivist 2017-2019

Photograph of Miten at the Being Human exhibition at Wellcome Collection, 2019.

Image by Steven Pocock, Wellcome Collection.

The traineeship at the Bodleian Libraries was a great experience and helped me on my way to pursuing a career in archives. Post traineeship, after applying for several jobs, I secured a position at the Wellcome Collection as a cataloguing archivist for the International Psychoanalytical Association (IPA) archive. I thought this role would be limited to cataloguing activities; however, the size, complexity and variety of Wellcome Collection has allowed me to do a lot more.

The day-to-day cataloguing of the IPA archive has involved all aspects of archival processing, with the addition of dealing with the expected and unexpected issues that arise with the logistical challenge of a large collection. Part of my role involves regular desk duty in the Rare Materials room and at the Library Enquiry desk, and answering email enquiries. I have also been able to undertake acquisition work with our collections development team and get involved with digital archiving processes and workflows.

I would recommend the traineeship as it exposes you to all aspects of working in a large archive. There are similarities and differences between the Bodleian Libraries and the Wellcome Collection, however the skills I gained from the traineeship have been invaluable in helping me adapt to working in a new archive and environment.

Miten Mistry, Sep 2019

Update from Rachael Gardner, former Graduate Trainee Digital Archivist 2015-2017

Photograph of Rachael in the palm house at Kew, 2019

Following my traineeship, I began work as a Project Cataloguer working on the Georgian Papers Programme at the Royal Archives, which is a large-scale digitisation, cataloguing and research project making all the Royal Archives’ Georgian material freely accessible online. Here I catalogued papers of George IV to item level, supported researchers, and gained experience of digitisation workflows. This built on the cataloguing and palaeography experience I had developed at the Bodleian.

I then moved to the Royal Botanic Gardens, Kew, where I am currently in the middle of a two-year project cataloguing the Miscellaneous Reports Collection, a large collection of nineteenth and twentieth century material relating to global and colonial networks of economic botany. I am responsible for the cataloguing of this complex collection, which includes correspondence, printed reports, newspaper cuttings, photographs, illustrations, and even plant specimens. My role has included implementing new cataloguing and indexing protocols, using linked data to index botanical names, managing volunteers, and promoting the collection to researchers and the public through exhibitions, group visits, social media, talks, and academic events.

I really enjoyed my traineeship and am grateful for the brilliant opportunity it gave me to gain skills in a wide variety of archive work, which has given me a useful grounding for my career so far.

Rachael Gardner, Oct 2019