On 3-4 November, I attended a two-day event at the British Library that highlighted the challenges and approaches of collecting materials created during times of war, conflict and crises. Through a series of panels and discussions, museum and library professionals, researchers and private collectors shared examples of incredible historical and contemporary initiatives to preserve diverse materials and heritage sites at risk of loss, decay or destruction.
Having recently worked on the joint Bodleian Libraries and History of Science Museum Collecting COVID project, I was particularly interested in contemporary programmes of collecting. Our project, which ran from 2021 to 2023, aimed to acquire and preserve the University of Oxford’s research response to the COVID-19 pandemic. It enabled us to capture, catalogue and publish over ninety oral history interviews.
Modern collections/initiatives showcased included:
Web Archiving the COVID-19 pandemic, Nicola Bingham, British Library
Coastal Connections (heritage sites at threat from coastal erosion), Dr Alex Kent, World Monuments Fund
Endangered Archives Programme (recent case studies include Ukraine, Gaza and Sudan), Dr Sam van Schaik, British Library
Collecting Human Stories during the war in Ukraine, Natalia Yemchenko, Rinat Akhmetov Foundation/Museum of Civilian Voices
Rapid collecting is a means to collect documentary evidence, preserve cultural memories and commemorate events. By providing access to these collections, institutions are then able to build a body of evidence and facilitate research. I was struck by the similarities between modern initiatives and those that had taken place a century before. Some of the contemporary examples of collections crowdsourcing harked back to the collecting of ephemera during the First World War. Dr Ann-Marie Foster highlighted the Bond of Sacrifice Collection and Women’s Work Collection (Imperial War Museums) in her presentation with Alison Bailey, in which families sent items memorialising loved ones, as examples of early collecting initiatives. Modern rapid collecting work has meant that contemporary archivists/curators have taken up this tradition, working actively to save materials at risk of loss through intentional selection.
As well as crowdsourcing and outreach, other strategies institutions draw upon in an increasingly online world are web archiving, digitisation and digital preservation. With social media now a main mode of communication for millions, web archiving is a useful tool to preserve and present online responses to global events. Work to capture websites relating to recent events is ongoing at both the Bodleian Libraries and British Library. I found Archive-It to be an incredibly useful tool for capturing and publishing a range of web pages for our project (including the social media pages of COVID-19 researchers, captured with permission); without reactive selection and preservation, these would otherwise have been at risk of loss.
Overall, the event highlighted that institutions must use active strategies towards preserving at-risk materials created during ongoing crises and conflicts, including:
Involving communities to assist in selection of materials;
Providing as representative a view of the event as possible (capturing diverse perspectives);
Providing access to collections and making them available as widely as possible (ethical considerations and sensitivities permitting);
Democratising collections and preserving them for future generations.
The Digital Archivist Trainees had the opportunity to attend the “Copy that Floppy” workshop organised by the Cambridge Future Nostalgia team on October 9, which provided an introduction to floppy disk imaging for digital archivists and digital preservation practitioners. This blog post outlines some of the key takeaways from our experience, and a full guide to floppy disk imaging produced by Future Nostalgia can be found here.
A floppy disk is a type of media which stores data on a magnetic-coated soft plastic disk in a hard plastic case. Popular in the 1970s–1990s, floppy disks come in several sizes: 8-inch, 5.25-inch, 3.5-inch, and sometimes 3-inch. While the number of 8-inch and 5.25-inch floppy disks sold in this period remained relatively stable, the number of 3.5-inch floppy disks sold rose dramatically in the 1990s. The Future Nostalgia team predicts that there will be a significant rise in the number of 3.5-inch disks in future accessions, and therefore creating the capacity to image 3.5-inch disks in particular before this influx should be a priority.
Workstation equipment including a floppy disk drive, ribbon cable, controller, 3.5-inch high density disk, and a power cable. Photo by Leontien Talboom.
Early floppy disks came in single-sided and double-sided formats, meaning that data could be reliably written on only one or both sides of the disks. It is also important to try to identify the “density”, or the way the disk was encoded and magnetised, as this affects how the disk can be read. 3.5-inch double density disks have a hole only in one corner, whereas 3.5-inch high density disks often have two. 5.25-inch disks are more difficult to identify as double or high density, and 8-inch disks are also sometimes single density. The disk manufacturer and type of computer used to write data can also affect the way the disk can be read (e.g., Mac data can be difficult to read on a non-Mac system and vice versa). Common disk manufacturers included Apple, Amstrad, and IBM.
Floppy disk drives that are compatible with the various sizes of floppy disks can be used with a “controller” to read disks on a modern computer. A controller is a piece of hardware that manages the connection between the disk drive and the modern machine, and crucially, it can read “flux-level data” from the disk. (Some 3.5-inch disks can also be read with a USB floppy drive, but these drives cannot read flux-level data, and it is the flux-level data that can help recover some information when a disk is damaged or degraded.) In the workshop, we used a “Greaseweazle”, which is the most commonly used floppy disk controller and runs with a Python package of the same name.
In teams, we each assembled a workstation to read various sizes of floppy disks. The Future Nostalgia team provided drives, controllers, and cables, as well as some test disks, and workshop participants also brought in their own disks that they had been hoping to read. Excitingly, one member of my team brought in a stack of 3-inch Amstrad floppy disks, which tend to be rarer than their 3.5-inch counterparts. We used a 26- to 34-pin ribbon cable to connect the 3-inch drive to our controller and a USB-C cable to connect the controller to a PC. The Amstrad drive also required a flipped power cable to connect it to an external 12V power source. Luckily, the expert at our table warned us this was necessary: a regular power cable or a power connection directly to the 5V-compatible Greaseweazle would’ve fried the drive or the board!
Setting up a workstation to image 3-inch Amstrad disks.
Despite everything being connected in a way that should have worked, the Greaseweazle software returned unexpected errors when trying to read the disk. Floppy disk drives and cables are fickle and will sometimes work or not work in the same set-up―it’s worth taking things apart, putting them back together, and trying again. Eventually, we discovered that the controller was unhappy with its connection to the ribbon cable and we had to instead connect it to a different port on the same cable. When that was done, the Greaseweazle was satisfied and we were able to image some Amstrad floppy disks! The first step was to take a flux image of the disk and view it using an emulator. From this flux image we were able to tell whether the disk was damaged (fortunately it was in good shape!) and how many tracks were stored on it. We then were able to convert the raw flux image data into a disk image, and extract some of the text files saved on the disk. It turned out that the stack of 3-inch disks contained research notes and bibliographies compiled by an historian of Anglo-Saxon history from whose archive they came.
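The two steps described above (capture a flux image, then convert it to a disk image) can be sketched as a small Python wrapper around the Greaseweazle command-line tool. This is only a sketch under assumptions, not a record of what we ran on the day: it assumes the `gw` command is installed and on the PATH, and that its `read` and `convert` subcommands and `--format` option behave as described in the Greaseweazle documentation for your version. The file names and the `ibm.720` format name are placeholders, not the settings used for the Amstrad disks.

```python
import subprocess
from pathlib import Path


def build_imaging_commands(flux_file, image_file, disk_format):
    """Return the two gw command lines: flux capture, then conversion.

    flux_file, image_file and disk_format are illustrative placeholders;
    the right format name depends on the disk and should be looked up in
    the Greaseweazle documentation.
    """
    flux_file, image_file = Path(flux_file), Path(image_file)
    # Step 1: read the disk and save the raw flux image.
    read_cmd = ["gw", "read", str(flux_file)]
    # Step 2: decode the flux image into a sector-level disk image.
    convert_cmd = ["gw", "convert", "--format", disk_format,
                   str(flux_file), str(image_file)]
    return [read_cmd, convert_cmd]


def image_disk(flux_file, image_file, disk_format, dry_run=True):
    """Print each step and, unless dry_run is set, execute it."""
    for cmd in build_imaging_commands(flux_file, image_file, disk_format):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run only: prints the commands without touching any hardware.
    image_disk("disk01.scp", "disk01.img", "ibm.720")
```

Keeping the flux capture and the conversion as separate steps mirrors the workflow above: the flux image can be kept as the preservation copy and converted again later if the first choice of format turns out to be wrong.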
My colleague Evie’s team ran into one of the most interesting cases of the day, which drew a small crowd of practitioners looking over her shoulder while she was imaging a disk. Curiously, the flux image kept returning data for only one side of the double-sided disk. The suspicion that we left with was that the user had first written the disk using both sides of a double-sided drive, but had later overwritten data on only one side by using a single-sided drive. Unfortunately, that meant that the oldest data was lost, but it generated a lot of speculation as to how to go about recovering as much as possible. Floppy disks are complicated, and they and the machines needed to read and write them were expensive. Users found creative ways to reuse and reformat disks, which means that manufacturers’ labels can be misleading when imaging disks today. The Future Nostalgia team estimated that, because of degradation or damage, they succeed in imaging disks only about 50% of the time, so it was an authentic experience not to get complete data off all of the disks we saw.
Evie copying some floppies! Photo by Mark Box.
This workshop was a fantastic crash course into floppy disk imaging, and many thanks to the Future Nostalgia team for inviting us along!
As some of you may know, since 2011 the Bodleian has been archiving websites, which are collected in the Bodleian Libraries Web Archive (BLWA) and made publicly accessible through the platform Archive-It. BLWA is thematically organised into seven collections: Arts and Humanities; Social Sciences; Science, Technology and Medicine; International; Oxford University Colleges; Oxford Student Societies and Oxford GLAM. As their names already suggest, much of the online content we collect relates to Oxford University and seeks to provide a snapshot of its intellectual, cultural and academic life as well as to document the University’s main administrative functions.
From the very beginning, the BLWA collection has also been regarded as a complement to and reflection of the Bodleian’s analogue special collections that users can consult in the reading rooms. For example, there are multiple meaningful links between our BLWA Arts & Humanities collection and the Bodleian’s Modern Archives & Manuscripts. By teasing out the connections between them, I hope to offer some concrete examples of how archived websites can be valuable to historical and cultural research and explore some of the reasons why the BLWA can be seen as integral to the Bodleian Special Collections.
Collecting author appreciation society websites…
In BLWA, you can find websites of societies dedicated to the study of famous authors whose papers are kept at the Bodleian (partly or in full), such as T.S. Eliot, J.R.R. Tolkien and Evelyn Waugh. An example from this category is The Philip Larkin Society website, which complements the holdings of correspondence to and from the poet and librarian Philip Larkin (1922-1985) held at the Bodleian.
The website provides helpful information to anyone with a general or academic interest in Larkin, as it lists talks and events about the poet as well as relevant publications and online resources promoted by the Society.
A 2018 capture in BLWA of a webpage from the Larkin Society website, describing a public art project celebrating Larkin’s famous poem ‘Toads’
The value of the archived version of The Philip Larkin Society website may not be immediately apparent now, when the live site is still active. However, in decades from now, this website may well become a primary source that offers a window onto how early 21st century society engaged with English poetry and disseminated research about the topic through media and formats distinctive of our time, such as online reviews, podcasts and blog posts.
…and social media accounts
Alongside websites, BLWA has been actively collecting Twitter accounts pertaining to authors and artists, such as The Barbara Pym Society Twitter presence.
A 2019 capture in BLWA of the Barbara Pym Society Twitter account
The Twitter feed preserves the memory of ephemeral, but meaningful encounters and forms of engagement with the works of English novelist Barbara Pym (1913-1980). The experience of consulting the Archive of English Novelist Barbara Pym in the Weston Reading rooms is enriched by the possibility of reading through the posts on the Pym Twitter account. From talks about Pym’s work to quotes in newspaper articles mentioning the author, the Twitter feed is not only a collection of news and information about Barbara Pym’s work, but also a representation of the lively network of individuals engaging with her writings, both in academic and broader circles.
Online presence of contemporary artists
Building an online presence through social media and a personal website is a promotional strategy that many contemporary artists and authors have adopted. A good example of this is the website of the British photographer and documentarist Daniel Meadows (b. 1952). In 2019, BLWA started taking regular captures of Meadows’ website, Photobus, following the acquisition of Meadows’ Archive a year earlier. This hybrid archive (which includes both analogue and born-digital items) has since been catalogued and its finding aid is available here.
The captures taken of Meadows’ Photobus site provide us with contextual information on the photographic series described in the finding aid of Meadows’ Archive at the Bodleian. Through the website, we get an account of Meadows’ life in his own words, we learn about the exhibitions where Meadows’ photographs were displayed and find out about the books in which his work has been published.
If you were to search for Daniel Meadows’ website on the live web right now, you would find that the website is still active, but looks rather different in content and layout from the captures archived in the BLWA between 2019 and March 2023.
Comparison of the ‘About’ page on Daniel Meadows’ website: the BLWA capture from January 2023 (top), and the capture from May 2023 (bottom)
Furthermore, the URL has changed from Photobus to the name of the photographer himself. Were it not for the version of the website archived in BLWA, the old content and structure of the site would not be as easily accessible. The website has also changed in scope, as it now provides us with a comprehensive digital repository of Meadows’ photographic series.
Comparing Meadows’ website in BLWA with his archive at the Bodleian, we can see an interesting series of correspondences between the digital and analogue realms, and between digital and physical archives. For example, the archived version of Meadows’ website Photobus is included as a link in the section of the finding aid for the Meadows archive devoted to ‘related materials’. In turn, the updated, 2023 version of Meadows’ site reflects in some respects the organisation and structure of an archive: his oeuvre is tidily arranged into series, each accompanied by a description and digital images of the photographs to match their arrangement in the physical archive at the Bodleian. Daniel Meadows’ new website exemplifies how, through the combination of metadata and high-resolution images, websites can become a powerful interface through which an archive is discovered and its contents accessed in ways that complement and enhance the experience of working through an archival box in a reading room.
Archived websites as a link to tomorrow’s archives
Web archives are a relatively recent phenomenon, so the uses of a collection of archived websites like the BLWA are only gradually beginning to emerge. The historical, cultural and evidential value of web archives is still overlooked, or perhaps just not yet fully exploited. It is only a matter of time before social media and websites like those kept in BLWA will be seen as an increasingly important resource on the cultural significance of 20th and 21st century authors and artists and the reception of their work. After all, for today’s authors and artists, social media and websites are an important vehicle for the dissemination of news about their work, of their opinions and creativity. As such, their online presence may be different in form, but similar in purpose and significance to the letters, pamphlets, alba amicorum and diaries that one would consult to research the social interactions, ideas, and activities of a humanist scholar.
One of the exciting aspects of working with digital archives is the proactive nature of our collecting practice. Curators of digital collections need to identify, select and collect relevant content before it disappears or decays – threats to which websites and social media are vulnerable. Through the choices we make today of content to archive, we are ultimately shaping the digital archives that will be accessible decades from now.
We are happy to consider suggestions from our users about websites that could be suitable additions to the collection. If you are curious to explore the BLWA collection further, you can find it here. The online nomination form can be found at this link. So don’t just follow the links – help us save them!
You may or may not know that, as well as the physical, tangible treasures in our Special Collections, Archives and Modern Manuscripts is also home to born-digital archives, which are stored, processed and managed through our digital repository, Bodleian Electronic Archives and Manuscripts (BEAM). In the past few years, the Bodleian Libraries have accessioned and processed a number of oral history collections, which are rich resources of spoken memory.
What kinds of oral histories do the Bodleian Libraries hold in Special Collections?
The development of medical history both locally and nationally is reflected in the holdings of Sir William Dunn School of Pathology oral histories and Recollecting Oxford Medicine: Oral Histories. Recollecting Oxford Medicine is a project funded and facilitated by Oxford Medical Alumni and generous private donors. The archive of their oral histories augments our current physical holdings on Oxford medics and medicine, by setting out to question and listen to a large range of interviewees across various departments, divisions and disciplines whose work also spanned different periods from the Second World War until the current day. Recollecting Oxford Medicine makes for a fascinating account of the development and changes of the Oxford Medical School and the Oxford Hospitals from the memories of those at the forefront.
Series of publicly accessible ROM interview recordings, hosted on University of Oxford Podcasts.
List of some of the ROM interviews available as podcast episodes through the Recollecting Oxford Medicine series. Episodes currently number 51.
Since the latter part of the twentieth century, oral history projects have consciously sought to fill gaps in collective history by interviewing subjects and collecting testimonies from those who may have been excluded from participation. Oxford Women in Computing: an Oral History project is one example of this practice, and a recurring theme in its interviews is the gender split in computing that interviewees perceived and experienced. These oral history interviews, conducted by Georgina Ferry, capture the stories and memories of pioneering women at the forefront of computing and its teaching, and in research and service provision at Oxford from the 1950s to the 1990s. The series of publicly accessible interviews can be found here.
Oral Histories and Archives
Processing oral history collections that have been kindly donated or transferred gives us the opportunity to learn and apply new skills urgently needed to manage the born-digital records of these projects and to preserve their authenticity and significant components. These include learning to use editing software to edit MP3 derivatives of master WAV audio recordings, as a means to comply with UK data protection legislation when creating public access versions of recordings. Part of the workflow of managing and making these oral histories available has also included mapping metadata, such as indexed names and subjects, from BEAM documentation to our cataloguing system, Bodleian Archives & Manuscripts, and to the back end of the publication portal for University of Oxford Podcasts, where the publicly accessible oral history recordings are currently hosted.
Oral histories are recognised as multi-faceted and valuable educational and research tools. The oral histories held in Special Collections are for everyone, whether a subject specialist, a multidisciplinary researcher, an inquisitive Oxford resident or university member… or just anyone curious who fancies learning about something new! University of Oxford Podcasts can be accessed for free online through the links given above, and also through Apple Podcasts.
Watch this space for updates on any new acquisitions or newly catalogued oral history projects.
There were three themes to this year’s conference: sustainability, diversity, and advocacy. Though each day of the conference covered one theme, one of the stand-outs of the conference was just how interlinked all three strands were.
Day one’s keynote speaker was Jeff James, Chief Executive and Keeper at The National Archives. Jeff talked about environmental sustainability, as well as the sustainability of the record and of the archives sector. He mentioned how The National Archives at Kew are committed to lowering their carbon footprint, which has been reduced by 80% since 2009. This has been achieved by building on scientific research with regards to buildings, bringing both a financial and environmental benefit. He also spoke of records at risk, referring to the work of the Culture Recovery Fund, the Covid-19 Archives Fund for records at risk and the Crisis Management Team alongside already established funding streams such as the Archives Revealed grant scheme. Digital records were flagged as records at risk, and he stressed the need for the sector to work in partnership and collaboration, both within the sector and with digital giants (such as Microsoft and Google), when developing digital products. On sector skills, he highlighted the need for records professionals to gain digital skills through schemes and strategies such as Plugged In Powered Up, the Novice to Know-How online training resource created by the Digital Preservation Coalition, the Digital Archives Learning Exchange, and the Bridging the Gap traineeship programme.
The fragility of born-digital records, identified as critically endangered by the Digital Preservation Coalition, was a common theme throughout the conference. Even the most modern of records are at risk (CD-Rs, for example, have a lifespan of under 10 years). Particular digital records discussed related to oral history interviews, often seen as ‘history from below’, recording the lives of those with ‘hidden histories’ off mainstream records, such as women and members of the LGBTQ+ community. Challenges to preserving digital material include cost, knowledge, skills and training, technology, and resources, as well as issues surrounding ‘gatekeeping’ and access to material. Rachel MacGregor (Digital Preservation Officer at The Modern Records Centre, University of Warwick) emphasised the need to record, describe, and catalogue born-digital collections well in order to ensure that they can be utilised by researchers, and explored some of the standards and guidance currently available.
Day two’s keynote speaker was Arike Oke (Managing Director, Black Cultural Archives) who spoke about experiences with diversity, aptly described as the equitable and mindful bringing together of difference; diversity should not be seen as static, but as a perpetual movement, both including and evolving difference. In her talk, Arike raised the point of classifying and being classified, and several sessions across the three days referred to how language and terminology impacted the use of records or archives created by or for particular communities. The use of historic terminology can be a barrier to access, particularly when words hold negative connotations that can cause distress to users. This was explored in several sessions in relation to LGBTQ+ related records and archives (including those kept at the Parliamentary Archives of the UK Parliament), as well as colonial collections such as the Miscellaneous Reports Collection held by the Royal Botanic Gardens in Kew. Thoughts on how to address the issues included guides or notes explaining the context and why such words were used, including modern terms or names in brackets, inviting feedback, and for events, giving participants time and space to process information.
The importance of being open to keeping more ephemeral material and objects (e.g. pin badges, leaflets and posters) was also highlighted, particularly in shedding light on lives not necessarily recorded in more traditional forms. Christopher Hilton of Britten Pears Arts gave an interesting presentation on the multitude of receipts kept by Benjamin Britten and his partner Peter Pears for tax purposes. The receipts were important in shedding light on their relationship by providing evidence that they maintained clearly separate financial lives, demonstrating how important it was for their professional lives at that period that their records could be used to demonstrate a ‘plausible deniability’ should their personal relationship be questioned. The receipts were also records of businesses in Aldeburgh which are now long gone, provoking memories for older residents and providing a tangible link between the archive and the town.
Day three’s keynote speaker was Deirdre McParland, Senior Archivist at the Electricity Supply Board (Ireland) whose inspirational talk focussed on the importance of advocacy and that ‘archives are for life, not just anniversaries’. Deirdre spoke of how archives should be pro-active and innovative when it comes to advocacy, and that projects should be strategically planned to include promotion as standard. Deirdre’s talk was followed by a talk by Jenny Moran and Robin Jenkins from the Record Office for Leicestershire, Leicester and Rutland, and Richard Wiltshire of the Crisis Management Team. Jenny, Robin and Richard talked about saving the archive of the travel firm Thomas Cook after the company’s sudden collapse: an excellent example of how swift action, negotiation and successful advocacy led to the ensured survival of the archive. The conference was nicely brought to a close by a talk by Alan and Bethan Ward on their project Photographs from Another Place. Their talk, given from the perspective of the archive user, showed how a bit of archival research revealed the names and stories behind a group of forgotten and unlabelled glass plate negatives. It was, for me at least, a timely reminder of the enduring value of archives.
This year’s International Internet Preservation Consortium Web Archiving Conference was held online from 15-16th June 2021, bringing together professionals from around the world to share their experiences of preserving the Web as a research tool for future generations. In this blog post, Simon Mackley reports back on some of the highlights from the conference.
How can we best preserve the World Wide Web for future researchers, and how can we best provide access to our collections? These were the questions that were at the forefront of this year’s International Internet Preservation Consortium Web Archiving Conference, which was hosted virtually by the National Library of Luxembourg. Web archiving is a subject of particular interest to me: as one of the Bodleian Library’s Graduate Trainee Digital Archivists, I spend a lot of my time working on our own Web collections as part of the Bodleian Libraries Web Archive. It was great therefore to have the chance to attend part of this virtual conference and hear for myself about new developments in the sector.
One thing that really struck me from the conference was the huge diversity in approaches to preserving the Web. On the one hand, many of the papers concerned large-scale efforts by national legal deposit institutions. For instance, Ivo Branco, Ricardo Basílio, and Daniel Gomes gave a very interesting presentation on the creation of the 2019 European Parliamentary Elections collection at the Portuguese Web Archive. This was a highly ambitious project, with the aim of crawling not just the Portuguese Web domain but also capturing a snapshot of elections coverage across 24 different European languages through the use of an automated search engine and a range of web crawler technologies (see their blog for more details). The World Wide Web is perhaps the ultimate example of an international information resource, so it is brilliant to see web archiving initiatives take a similarly international approach.
At the other end of the scale, Hélène Brousseau gave a fascinating paper on community-based web archiving at Artexte library and research centre, Canada. Within the arts community, websites often function as digital publications analogous to traditional exhibition catalogues. Brousseau emphasised the need for manual web archiving rather than automated crawling as a means of capturing the full content and functionality of these digital publications, and at Artexte this has been achieved by training website creators to self-archive their own websites using Conifer. Given that in many cases web archivists have minimal or even no contact with website creators, it was fascinating to hear of an approach that places creators at the very heart of the process.
It was also really interesting to hear about the innovative new ways that web archives were engaging with researchers using their collections, particularly in the use of new ‘Labs’-style approaches. Marie Carlin and Dorothée Benhamou-Suesser for instance reported on the new services being planned for researchers at the Bibliothèque nationale de France Data Lab, including a crawl-on-demand service and the provision of web archive datasets. New methodologies are always being developed within the Digital Humanities, and so it is vitally important that web archives are able to meet the evolving needs of researchers.
Like all good conferences, the papers and discussions did not solely focus on the successes of the past year, but also explored the continued challenges of web archiving and how they can be addressed. Web archiving is often a resource-intensive activity, which can prove a significant challenge for collecting institutions. This was a major point of discussion in the panel session on web archiving the coronavirus pandemic, as institutions had to balance the urgency of quickly capturing web content during a fast-evolving crisis against the need to manage resources for the longer-term, as it became apparent that the pandemic would last months rather than weeks. It was clear from the speakers that no two institutions had approached documenting the pandemic in quite the same way, but nonetheless some very useful general lessons were drawn from the experiences, particularly about the need to clearly define collection scope and goals at the start of any collecting project dealing with rapidly changing events.
The question of access presents an even greater challenge. We ultimately work to preserve the Web so that researchers can make use of it, but as a sector we face significant barriers in delivering this goal. The larger legal deposit collections, for instance, can often only be consulted in the physical reading rooms of their collecting libraries. In his opening address to the conference, Claude D. Conter of the National Library of Luxembourg addressed this problem head-on, calling for copyright reform in order to meet reader expectations of access.
Yet although these challenges may be significant, I have no doubt from the range of new and innovative approaches showcased at this conference that the web archiving sector will be able to overcome them. I am delighted to have had the chance to attend the conference, and I cannot wait to see how some of the projects presented continue to develop in the years to come.
On Wednesday 19th November I attended the UK Web Archive (UKWA) mini-conference 2020, my first conference as a Graduate Trainee Digital Archivist. It was hosted by Jason Webber, Engagement Manager at the UKWA, and, as is normal in these COVID times, it took place on Zoom (my first ever Zoom experience!).
The conference started with an introduction to and demonstration of the UKWA by Jason Webber. Since 2005, the UKWA’s mission has been to collect the entire UK webspace at least once per year and preserve the websites for future generations. As part of my traineeship I have used the UKWA, but it was interesting to hear about the other functions and collections it provides. Along with being able to browse different versions of UK websites, it also includes over 100 curated collections on themes ranging from Food to Brexit to Online Enthusiast Communities in the UK. It also features the SHINE tool, which was developed as part of the ‘Big UK Data Arts and Humanities’ project and contains over 3.5 billion items which have been full-text indexed so that every word is searchable. It allows users to perform searches and trend analysis on subjects over a huge range of websites; all you need to use this tool is a bit of Python knowledge. My Python knowledge is a bit basic, but Caio Mello, during his researcher talk, provided a useful link to online Python tutorials aimed at historians to aid in their research.
In his talk, Caio Mello (School of Advanced Study, University of London) discussed how he used the SHINE tool as part of his work for the CLEOPATRA Project. He was specifically looking at the Olympic legacy of the 2012 Olympics, how it was defined and how the view of the legacy changed over time. He explained the process he used to extract the information and the ways the information can be used for analysis, visualisation and context. My background is in mathematics and the concept of ‘Big Data’ came up frequently during my studies so it was fascinating to see how it can be used in a research project and how the UKWA is enabling research to be conducted over such a wide range of subjects.
The next researcher talk by Liam Markey (University of Liverpool and the British Library) showed a different approach to using the UKWA for his research project into how Remembrance in 20th Century Britain has changed. He explained how he conducted an analysis of archived newspaper articles, using specific search terms, to identify articles that focused on commemoration which he could then use to examine how the attitudes changed over time. The UKWA enabled him to find websites that focused on the war and compare these with mainstream newspapers to see how these differ.
The keynote speaker was Paul Gooding (University of Glasgow), who spoke about the use and users of Non-Print Legal Deposit libraries. His research as part of the Digital Library Futures Project, with the Bodleian Libraries and Cambridge University Library as case study partners, looked at how academic deposit libraries were impacted by e-Legal Deposit. It was an interesting discussion around some of the issues of the system, such as balancing commercial rights with access for users, and how highly restrictive access conditions are at odds with more recent legislation, such as the provision for disabled users and the 2014 copyright exception for data and text mining for non-commercial uses.
Being new to the digital archiving world, my first conference was a great introduction to web archiving and provided context to the work I am doing. Thank you to the organisers and speakers for giving me insight into a few of the different ways the web archive is used. I have come away with a greater understanding of the scope and importance of digital archiving (as well as a list of blog posts and tutorials to delve into!).
This year the annual iPRES digital preservation conference was understandably postponed and in its place the community hosted a 3-day Zoom conference called #WeMissiPRES. As two of the Bodleian Libraries’ Graduate Trainee Digital Archivists, Simon and I were in attendance and blogged about our experiences. This post contains some of my highlights.
The conference kicked off with a keynote by Geert Lovink. Geert is the founding director of the Institute of Network Cultures and the author of several books on critical Internet studies. His talk was wide-ranging and covered topics from the rise of so-called ‘Zoom fatigue’ (I guarantee you know this feeling by now) to how social media platforms affect all aspects of contemporary life, often in negative ways. Geert highlighted the importance of preserving social media in order to allow future generations to be able to understand the present historical moment. However, this is a complicated area of digital preservation because archiving social media presents a host of ethical and technical challenges. For instance, how do we accurately capture the experience of using social media when the content displayed to you is largely dictated by an algorithm that is not made public for us to replicate?
After the keynote I attended a series of talks about the ARCHIVER project. João Fernandes from CERN explained that the goal of this project is to improve archiving and digital preservation services for scientific and research data. Preservation solutions for this type of data need to be cost-effective, scalable, and capable of ingesting amounts of data within the petabyte range. There were several further talks from companies who are submitting to the design phase of this project, including Matthew Addis from Arkivum. Matthew’s talk focused on the ways that digital preservation can be conducted on the industrial scale required to meet the brief and explained that Arkivum is collaborating with Google to achieve this, because Google’s cloud infrastructure can be leveraged for petabyte-scale storage. He also noted that while the marriage of preserved content with robust metadata is important in any digital preservation context, it is essential for repositories dealing with very complex scientific data.
In the afternoon I attended a range of talks that addressed new standards and technologies in digital preservation. Linas Cepinskas (Data Archiving and Networked Services (DANS)) spoke about a self-assessment tool for the FAIR principles, which is designed to assess whether data is Findable, Accessible, Interoperable and Reusable. Later, Barbara Sierman (DigitalPreservation.nl) and Ingrid Dillo (DANS) spoke about TRUST, a new set of guiding principles that are designed to map well with FAIR and assess the reliability of data repositories. Antonio Guillermo Martinez (LIBNOVA) gave a talk about his research into Artificial Intelligence and machine learning applied to digital preservation. Through case studies, he identified that AI is especially good at tasks such as anomaly detection and automatic metadata generation. However, he found that regardless of how well the AI performs, it needs to generate better explanations for its decisions, because it’s hard for human beings to build trust in automated decisions that we find opaque.
Paul Stokes from Jisc3C gave a talk on calculating the carbon costs of digital curation and unfortunately concluded that not much research has been done in this area. The need to improve the environmental sustainability of all human activity could not be more pressing and digital preservation is no exception, as approximately 3% of the world’s electricity is used by data centres. Paul also offered the statistic that enough power is consumed by data centres worldwide to boil 10,400,000,000,000,000 kettles – which is the most important digital preservation metric I can think of.
This conference was challenging and eye-opening because it gave me an insight into (complicated!) areas of digital preservation that I was not familiar with, particularly surrounding the challenges of preserving large quantities of scientific and research data. I’m very grateful to the speakers for sharing their research and to the organisers, who did a fantastic job of bringing the community together to bridge the gap between 2019 and 2021!
Every year, the international digital preservation community meets for the iPRES conference, an opportunity for practitioners to exchange knowledge and showcase the latest developments in the field. With the 2020 conference unable to take place due to the global pandemic, digital preservation professionals instead gathered online for #WeMissiPRES to ensure that the global community remained connected. Our graduate trainee digital archivist Simon Mackley attended the first day of the event; in this blog post he reflects on some of the highlights of the talks and what they tell us about the state of the field.
How do you keep the global digital preservation community connected when international conferences are not possible? This was the challenge faced by the organisers of #WeMissiPRES, a three-day online event hosted by the Digital Preservation Coalition. Conceived as a festival of digital preservation, the aim was not to try to replicate the regular iPRES conference in an online format, but instead to serve as a bridge for the digital preservation community, connecting the efforts of 2019 with the plans for 2021.
As might be expected, the impact of the pandemic loomed large in many of the talks. Caylin Smith (Cambridge University Library) and Sara Day Thomson (University of Edinburgh) for instance gave a fascinating paper on the challenge of rapidly collecting institutional responses to coronavirus, focusing on the development of new workflows and streamlined processes. The difficulties of working from home, the requirements of remote access to resources, and the need to move training online likewise proved to be recurrent themes throughout the day. As someone whose own experience of digital preservation has been heavily shaped by the pandemic (I began my traineeship at the start of lockdown!) it was really useful to hear how colleagues in other institutions have risen to these challenges.
I was also struck by the different ways in which responses to the crisis have strengthened digital preservation efforts. Lynn Bruce and Eve Wright (National Records of Scotland) noted for instance that the experience of the pandemic has led to increased appreciation of the value of web-archiving from stakeholders, as the need to capture rapidly-changing content has become more apparent. Similarly, Natalie Harrower (Digital Repository of Ireland) made the excellent point that the crisis had highlighted not only the urgent need for the sharing of medical research data, but also the need to preserve it: coronavirus data may one day prove essential to fighting a future pandemic, and so there is a moral imperative for us to ensure that it is preserved.
As our keynote speaker Geert Lovink (Institute of Network Cultures) reminded us, the events of the past year have been momentous quite apart from the pandemic, with issues such as the distorting impacts of social media on society, the climate emergency, and global demands for racial justice all having risen to the forefront of society. It was great therefore to see the role of digital preservation in these challenges being addressed in many of the panel sessions. A personal highlight for me was the presentation by Daniel Steinmeier (KB National Library of the Netherlands) on diversity and digital preservation. Steinmeier stressed that in order for diversity efforts to be successful, institutions needed to commit to continuing programmes of inclusion rather than one-off actions, with the communities concerned actively included in the archiving process.
So what challenges can we expect from the year ahead? Perhaps more than ever, this year this question has been a difficult one to answer. Nonetheless, a key theme that struck me from many of the discussions was that the growing challenge of archiving social media platforms was matched only by the increasing need to preserve the content hosted on them. As Zefi Kavvadia (International Institute of Social History) noted, many social media platforms actively resist archiving; even when preservation is possible, curators are faced with a dilemma between capturing user experiences and capturing platform data. Navigating this challenge will surely be a major priority for the profession going forward.
While perhaps no substitute for meeting in person, #WeMissiPRES nonetheless succeeded in bringing the international digital preservation community together in a shared celebration of the progress being made in the field, successfully bridging the gap between 2019 and 2021, and laying the foundations for next year’s conference.
#WeMissiPRES was held online from 22nd-24th September 2020. For more information, and for recordings of the talks and panel sessions, see the event page on the DPC website.
Born-digital material from the Recollecting Oxford Medicine oral history project has been donated to the Weston Library since the early 2010s, and the project is still active today with further interviews planned. A selection of interviews from the project are now available to listen to online, via University of Oxford podcasts.
The Recollecting Oxford Medicine oral history project comprises interviews with Oxford medics, which provide individual perspectives on both pre-clinical and clinical courses at the Oxford Medical School and on medical careers in Oxford and other locations, and give an insight into the evolution of clinical medicine at Oxford since the mid-1940s.
The interviewees have worked in a range of specialisms and departments including psychiatry, neurology, endocrinology and dermatology, to name a few. Episode 18 comprises an interview with John Ledingham, former Director of Clinical Studies (a position he held twice!), recorded by Peggy Frith and Rosie Fitzherbert Jones in 2012. In episodes 11-12 we can learn about Chris Winearls – a self-proclaimed ‘accidental Rhodes Scholar’ from medical school in Cape Town – his journey into nephrology and how he later became Associate Professor of Medicine for the university.
Listen to the Recollecting Oxford Medicine oral history podcast series online at https://podcasts.ox.ac.uk/series/recollecting-oxford-medicine-oral-histories
In episode 1, John Spalding, interviewed by John Oxbury in 2011, discusses working under Hugh Cairns, initially as a student houseman at the Radcliffe Infirmary during the Second World War. Spalding also recounts his experience of the initial conception of the East Radcliffe Ventilator, which was first devised for use in the treatment of polio. In episode 13 we can listen to Derek Hockaday’s interview with Joan Trowell, former Deputy Director of Clinical Studies for Oxford Medical School, which amongst other topics covers her experience of roles held at the General Medical Council.
The majority of the interviews were undertaken by Derek Hockaday, former Oxford hospitals consultant physician and Emeritus Fellow of Brasenose College. The cataloguing and preservation of the oral history project is supported by Oxford Medical Alumni. The library acknowledges the donations of material and financial support by Derek Hockaday and OMA respectively.