Tag Archives: ArchiveIT

Reflections on Curating in the Crossfire: Collecting in the Time of War, Conflict and Crises

On 3-4 November, I attended a two-day event at the British Library that highlighted the challenges and approaches of collecting materials created during times of war, conflict and crises. Through a series of panels and discussions, museum and library professionals, researchers and private collectors shared examples of incredible historical and contemporary initiatives to preserve diverse materials and heritage sites at risk of loss, decay or destruction.

Having recently worked on the joint Bodleian Libraries and History of Science Museum Collecting COVID project, I was particularly interested in contemporary programmes of collecting. Our project, which ran from 2021 to 2023, aimed to acquire and preserve the University of Oxford’s research response to the COVID-19 pandemic. It enabled us to capture, catalogue and publish over ninety oral history interviews.

Modern collections/initiatives showcased included:

  • Web Archiving the COVID-19 pandemic, Nicola Bingham, British Library
  • Coastal Connections (heritage sites at threat from coastal erosion), Dr Alex Kent, World Monuments Fund
  • Crowdsourcing photographs for the Picturing Lockdown Collection, Dr Tamsin Silvey, Historic England
  • Endangered Archives Programme (recent case studies include Ukraine, Gaza and Sudan), Dr Sam van Schaik, British Library
  • Collecting Human Stories during the war in Ukraine, Natalia Yemchenko, Rinat Akhmetov Foundation/Museum of Civilian Voices

Rapid collecting is a means to collect documentary evidence, preserve cultural memories and commemorate events. By providing access to these collections, institutions are then able to build a body of evidence and facilitate research. I was struck by the similarities between modern initiatives and those that had taken place a century before. Some of the contemporary examples of crowdsourced collections harked back to the collecting of ephemera during the First World War. In her presentation with Alison Bailey, Dr Ann-Marie Foster highlighted the Bond of Sacrifice Collection and Women’s Work Collection (Imperial War Museums), for which families sent items memorialising loved ones, as examples of early collecting initiatives. Through modern rapid collecting work, contemporary archivists and curators have taken up this tradition, working actively to save materials at risk of loss through intentional selection.

As well as crowdsourcing and outreach, other strategies institutions draw upon in an increasingly online world are web archiving, digitisation and digital preservation. With social media now a main mode of communication for millions, web archiving is a useful tool to preserve and present online responses to global events. Work to capture websites relating to recent events is ongoing at both the Bodleian Libraries and British Library. For our project, I found Archive-It to be an incredibly useful tool to capture and publish a range of web pages (including the social media pages of COVID-19 researchers, captured with their permission) which, without reactive selection and preservation, would otherwise have been at risk of loss.

Overall, the event highlighted that institutions must use active strategies towards preserving at-risk materials created during ongoing crises and conflicts, including:

  • Involving communities to assist in selection of materials;
  • Providing as representative a view of the event as possible (capturing diverse perspectives);
  • Providing access to collections and making them available as widely as possible (ethical considerations and sensitivities permitting);
  • Democratising collections and preserving them for future generations.

Why and how do we Quality Assure (QA) websites at the BLWA?

At the Bodleian Libraries Web Archive (BLWA), we Quality Assure (QA) every site in the web archive. This blog post aims to give a brief introduction to why and how we QA. The first steps of our web archiving involve crawling a site, using the tools developed by ArchiveIT. These tools allow entire websites to be captured and browsed using the Wayback Machine as if they were live, allowing you to download files, view videos/photos and interact with dynamic content, exactly how the website owner would want you to. However, due to the huge variety and technical complexity of websites, there is no guarantee that every capture will be successful (that is to say, that all the content is captured and working as it should be). Currently there is no accurate automatic process to check this, and so this is where we step in.

We want to ensure that the sites on our web archive are an accurate representation in every way. We owe this to the owners and the future users. Capturing the content is hugely important, but so too is how it looks, feels and how you interact with it, as this is a major part of the experience of using a website.

Quality assurance of a crawl involves manually checking the capture. Using the live site as a reference, we explore the archived capture, clicking on links, trying to download content or view videos, and noting any major discrepancies from the live site or any other issues. Sometimes a picture or two will be missing, or it may be that a certain link is not resolving correctly, which can be relatively easy to fix. Other times there are massive differences compared to the live site, and so the (often long and sometimes confusing) process of solving the problem begins. Some common issues we encounter are:

  • Incorrect formatting
  • Images/video missing
  • Large file sizes
  • Crawler traps
  • Social media feeds
  • Dynamic content playback issues
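As a rough illustration of the "Images/video missing" check, the comparison between a live page and its capture can be partly sketched in code. This is a minimal sketch only, not a description of the BLWA workflow or of any ArchiveIT tool: the names MediaCollector, media_sources and missing_media are hypothetical, and it assumes you already have the HTML of both pages as strings with the capture's rewritten URLs normalised back to their original form. It only compares what the two pages reference, so it cannot tell whether a file was actually captured or plays back correctly, and it is no substitute for checking the capture by hand.

```python
from html.parser import HTMLParser


class MediaCollector(HTMLParser):
    """Collects the src attribute of image, video and audio tags."""

    # Tags whose src attribute points at a media file.
    MEDIA_TAGS = {"img", "video", "audio", "source"}

    def __init__(self):
        super().__init__()
        self.sources = set()

    def handle_starttag(self, tag, attrs):
        if tag in self.MEDIA_TAGS:
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.add(value)


def media_sources(html):
    """Return the set of media URLs referenced in an HTML string."""
    parser = MediaCollector()
    parser.feed(html)
    parser.close()
    return parser.sources


def missing_media(live_html, archived_html):
    """Return media referenced on the live page but not in the capture.

    Assumes URLs in both inputs are directly comparable (hypothetical
    pre-normalisation step not shown here).
    """
    return sorted(media_sources(live_html) - media_sources(archived_html))


if __name__ == "__main__":
    live = '<img src="/a.png"><img src="/b.png"><video src="/clip.mp4"></video>'
    archived = '<img src="/a.png">'
    for url in missing_media(live, archived):
        print("Referenced on live page but not in capture:", url)
```

A list like this could at most point a reviewer towards pages worth a closer look; formatting problems, crawler traps and dynamic content still need a person exploring the capture.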

There are many techniques available to help us solve these problems, but there is no ‘one fix for all’: the same issue on two different sites may require two different solutions. There is a lot of trial and error involved, and over the years we have gained a lot of knowledge on how to solve a variety of issues. ArchiveIT also has a fantastic FAQ section on their site. However, if we have gone through the usual avenues and still cannot solve our problems, then our final port of call is to ask the geniuses at ArchiveIT, who are always happy and willing to help.

An example of how important and effective QA can be. The initial test capture did not have the correct formatting and was missing images. This was resolved after the QA process.

QA’ing is a continual process. Websites add new content or companies change to different website designers, meaning captures of websites that have previously been successful might suddenly have an issue. It is for this reason that every crawl is given special attention and is QA’d. QA’ing the captures before they are made available is a time-consuming but incredibly important part of the web archiving process at the Bodleian Libraries Web Archive. It allows us to maintain a high standard of capture and provide an accurate representation of the website for future generations.