Reflections on Curating in the Crossfire: Collecting in the Time of War, Conflict and Crises

On 3-4 November, I attended a two-day event at the British Library that highlighted the challenges and approaches of collecting materials created during times of war, conflict and crises. Through a series of panels and discussions, museum and library professionals, researchers and private collectors shared examples of incredible historical and contemporary initiatives to preserve diverse materials and heritage sites at risk of loss, decay or destruction.

Having recently worked on the joint Bodleian Libraries and History of Science Museum Collecting COVID project, I was particularly interested in contemporary programmes of collecting. Our project, which ran from 2021 to 2023, aimed to acquire and preserve the University of Oxford’s research response to the COVID-19 pandemic. It enabled us to capture, catalogue and publish over ninety oral history interviews.

Modern collections/initiatives showcased included:

  • Web Archiving the COVID-19 pandemic, Nicola Bingham, British Library
  • Coastal Connections (heritage sites at threat from coastal erosion), Dr Alex Kent, World Monuments Fund
  • Crowdsourcing photographs for the Picturing Lockdown Collection, Dr Tamsin Silvey, Historic England
  • Endangered Archives Programme (recent case studies include Ukraine, Gaza and Sudan), Dr Sam van Schaik, British Library
  • Collecting Human Stories during the war in Ukraine, Natalia Yemchenko, Rinat Akhmetov Foundation/Museum of Civilian Voices

Rapid collecting is a means to collect documentary evidence, preserve cultural memories and commemorate events. By providing access to these collections, institutions are then able to build a body of evidence and facilitate research. I was struck by the similarities between modern initiatives and those that had taken place a century before. Some of the contemporary examples of crowdsourced collections harked back to the collecting of ephemera during the First World War. In her presentation with Alison Bailey, Dr Ann-Marie Foster highlighted the Bond of Sacrifice Collection and Women’s Work Collection (Imperial War Museums), to which families sent items memorialising loved ones, as examples of early collecting initiatives. Through modern rapid collecting work, contemporary archivists and curators have taken up this tradition, working actively to save materials at risk of loss through intentional selection.

As well as crowdsourcing and outreach, other strategies institutions draw upon in an increasingly online world are web archiving, digitisation and digital preservation. With social media now a main mode of communication for millions, web archiving is a useful tool to preserve and present online responses to global events. Work to capture websites relating to recent events is ongoing at both the Bodleian Libraries and British Library. For our project, I found Archive-It to be an incredibly useful tool to capture and publish a range of web pages (including the social media pages of COVID-19 researchers, captured with permission) which, without reactive selection and preservation, would have been at risk of loss.

Overall, the event highlighted that institutions must use active strategies towards preserving at-risk materials created during ongoing crises and conflicts, including:

  • Involving communities to assist in selection of materials;
  • Providing as representative a view of the event as possible (capturing diverse perspectives);
  • Providing access to collections and making them available as widely as possible (ethical considerations and sensitivities permitting);
  • Democratising collections and preserving them for future generations.

Research using social media data: Use Cases (3/3)

The Algorithmic Archive project is a one-year project funded by the Mellon Foundation. As part of the first Work Package, we explored how researchers from different disciplines use social media data to answer various research questions.

This post is the third in a three-part series presenting use cases drawn from research conducted as part of the Algorithmic Archive project.

We would like to thank the researchers who generously shared insights from their work.


Use Case – Study on the trustworthiness of social media visual content among young adults (TRAVIS project)[1]

Research questions and aim(s):

Trust And Visuality: Everyday digital practices (TRAVIS) is an ESRC project which has received funding from the European Union’s Horizon 2020 Research and Innovation Programme. The research project looks at how young adults experience, build and express trust in news and social media images related to wellbeing and health. It explores how and why people trust some visuals over others and how content creators establish trustworthiness through visual content. The TRAVIS project involves cross-national collaboration between multiple research teams located at different universities in the UK and Europe. These include the University of Oxford, where the team is based in the School of Geography and the Environment.

Social media data used:

The project included data collected indirectly from platforms including Facebook, Instagram, TikTok and YouTube (see below).

Tools and methods adopted:

Data collection from social media consisted of screenshots taken from the devices of interviewed young adults, as the TRAVIS project investigates the meaning of social media posts (visual content) via interviews with young adult users. The dataset generated by this method of collection comprises around 400 screenshots, stored on an institutional cloud drive that is accessible to the whole team.


[1] Further information about the TRAVIS project is available here: https://www.tlu.ee/en/bfm/researchmedit/trust-and-visuality-everyday-digital-practices-travis

Research using social media data: Use Cases (2/3)

The Algorithmic Archive project is a one-year project funded by the Mellon Foundation. As part of the first Work Package, we explored how researchers from different disciplines use social media data to answer various research questions.

This post is the second in a three-part series presenting use cases drawn from research conducted as part of the Algorithmic Archive project.

We would like to thank the researchers who generously shared insights from their work.


Use Case – Exploring Algorithmic Mediation and Recommendation Systems on YouTube [1]

Research questions and aim(s):

The study sought to investigate how the YouTube platform operates, focusing on algorithmic activity and the strategies employed by both human and automated (robot) actors within federal and regional elections. The aim was to understand the impact that this system of mediation has on society and to demystify preconceptions of ideologically neutral technologies in highly disputed political events. The research focuses on two case studies: 1) the 2018 Ontario (Canada) election and 2) the 2018 Brazilian Federal Election. The data collection was carried out during the campaigning periods, between May and June in Ontario, and between August and October 2018 in Brazil.

Social media data used:

The research focussed solely on the YouTube platform. Specifically, the researchers collected information about recommended videos starting from specific keywords related to the election campaign.

Tools and methods adopted:

The data collection was carried out using a Python script developed by the Algo Transparency project. The script automates YouTube search operations based on specified keywords (e.g., the names of the candidates), allowing the researcher to gather video-related data and the relative ranking position displayed to the user. Once the keywords were defined, the tool retrieved links for the top four results for each keyword and then examined the recommendation section. This process was repeated four times, each time collecting recommended videos, simulating a user interacting with algorithmic suggestions.
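The traversal described above can be sketched roughly as follows. This is not the Algo Transparency script itself: `search_fn` and `recommend_fn` are hypothetical stand-ins for whatever retrieves search results and recommended videos, the toy data exists only so the example runs offline, and the sketch assumes that each of the four rounds follows the recommendations gathered in the previous round.

```python
from typing import Callable, List, Tuple

Row = Tuple[str, int, int, str, str]


def collect_recommendations(
    keywords: List[str],
    search_fn: Callable[[str], List[str]],      # hypothetical: keyword -> ranked video ids
    recommend_fn: Callable[[str], List[str]],   # hypothetical: video id -> ranked recommendations
    top_n: int = 4,
    rounds: int = 4,
) -> List[Row]:
    """Return rows of (keyword, round, rank, source_video, recommended_video)."""
    rows: List[Row] = []
    for keyword in keywords:
        # Start from the top search results for this keyword.
        frontier = search_fn(keyword)[:top_n]
        seen = set(frontier)
        for round_no in range(1, rounds + 1):
            next_frontier = []
            for source in frontier:
                for rank, rec in enumerate(recommend_fn(source), start=1):
                    rows.append((keyword, round_no, rank, source, rec))
                    # Follow each newly seen recommendation in the next round.
                    if rec not in seen:
                        seen.add(rec)
                        next_frontier.append(rec)
            frontier = next_frontier
    return rows


# Toy data standing in for the live platform (assumption, not real results).
TOY_SEARCH = {"candidate a": ["v1", "v2", "v3", "v4", "v5"]}
TOY_RECS = {"v1": ["v6", "v2"], "v2": ["v6"], "v6": ["v7"], "v7": []}

rows = collect_recommendations(
    ["candidate a"],
    search_fn=lambda kw: TOY_SEARCH.get(kw, []),
    recommend_fn=lambda vid: TOY_RECS.get(vid, []),
)
for row in rows:
    print(row)
```

Recording the rank alongside each recommendation keeps the ranking position that the paragraph above mentions, so the rows can later be analysed as a ranked network of videos.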

Data collected was stored on personal devices and the institutional cloud, and can be visualized at the following links:


[1] Reis, R., Zanetti, D., & Frizzera, L. (2020). A conveniência dos algoritmos: o papel do YouTube nas eleições brasileiras de 2018. Compolítica, 10(1), 35–58. https://doi.org/10.21878/compolitica.2020.10.1.333

Research using social media data: Use Cases (1/3)

The Algorithmic Archive project is a one-year project funded by the Mellon Foundation. As part of the first Work Package, we explored how researchers from different disciplines use social media data to answer various research questions.

This post is the first in a three-part series presenting use cases drawn from research conducted as part of the Algorithmic Archive project.

We would like to thank the researchers who generously shared insights from their work.


Use Case: Network/cluster analysis to investigate the construction and influence of information trustworthiness within social movements on Twitter [1]

Research questions and aim(s):

The researcher wanted to explore the construction and influence of information trustworthiness within social movements in the context of the Hong Kong protests and the #BlackLivesMatter movement. Social media platforms offer a digital space in which social movements can diffuse critical information, form networks, coordinate protests and reach a wider audience.

Social media data used:

This study focused on Twitter as it was used evenly by both social movements, and the researcher already had an established presence on this platform. Also, at the time of data collection (2020-2021), access to Twitter data for academic research was still relatively open to researchers.

For the purpose of this study, the researcher examined the following/follower relationships of top accounts with millions of followers that had been selected as major information disseminators, including organisations, individuals or accounts serving a particular niche or purpose.

Data collection was conducted at a specific point in time in 2021. Quantitative analysis of the social media data (e.g. cluster analysis) was complemented with qualitative data collected via an online survey.

Tools and methods adopted:

The researcher requested and obtained access to the Twitter API. However, high-level coding skills were required to access the data, which the researcher did not have at that time due to their predominantly qualitative research background. To address this, the researcher found and used a Go script called Nucoll[2], which is freely available on GitHub and enabled the researcher to collect the required data. Nucoll is a command-line tool that, according to its developer, retrieves data from Twitter using keyword instructions, for which the developer provided example queries and brief explanations. For each social movement, the researcher selected three organisations: one large organisation, one activist group, and one additional account that was relevant to the movement. Once these accounts were selected, they were processed through the script to capture all following/follower relationships and combine them into a graph for each protest analysed. Further data visualisation and analysis — including clustering and network analysis — were conducted using Gephi.
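As a rough illustration of the step in which following/follower relationships are combined into one graph per movement, the sketch below merges two toy relationship tables into a deduplicated edge list and writes it as a CSV with `Source` and `Target` columns, a layout that Gephi's spreadsheet importer can read. It does not reproduce Nucoll's behaviour or output format; the account names, `build_edges` and `write_edge_csv` are hypothetical.

```python
import csv
import io
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]


def build_edges(
    followers: Dict[str, Iterable[str]],
    following: Dict[str, Iterable[str]],
) -> List[Edge]:
    """Directed edges (follower -> followed), deduplicated and sorted."""
    edges: Set[Edge] = set()
    for account, fans in followers.items():
        for fan in fans:
            edges.add((fan, account))      # fan follows account
    for account, friends in following.items():
        for friend in friends:
            edges.add((account, friend))   # account follows friend
    return sorted(edges)


def write_edge_csv(edges: Iterable[Edge], handle) -> None:
    """Write an edge list with Source/Target headers."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)


# Toy relationships for one movement (invented accounts, not real data).
followers = {"big_org": ["user1", "activists"], "activists": ["user1"]}
following = {"activists": ["big_org"], "big_org": []}

edges = build_edges(followers, following)
buffer = io.StringIO()
write_edge_csv(edges, buffer)
print(buffer.getvalue())
```

Using a set means a relationship that appears in both tables is stored only once, which matters when the same pair of accounts is seen from both sides.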


[1] Charlotte Im, The Construction and Influence of Information Trustworthiness in Social Movements, Doctoral Thesis, University College London (UCL), 2024.

[2] https://github.com/jdevoo/nucoll

The Algorithmic Archive: a project overview

What is the Algorithmic Archive Project?

In 2024, the Algorithmic Archive Project received funding from the Mellon Foundation to carry out scoping research that will ultimately support the Bodleian Libraries in the development of a lasting, interoperable infrastructure and sustainable strategies for archiving web-based data, including social media data and algorithms. The project is part of the broader Future Bodleian programme, which aims to expand and evolve the Libraries’ centuries-old role by engaging with the digital domain.

Why archive social media data?

In the past two decades, social media platforms have become a central means of communication, enabling people from across the globe to engage in discussions that transcend geographical borders, reflect on contemporary events and contribute to collective memory. Given their profound impact on society, researchers across various disciplines increasingly rely on social media data to analyse social, economic, and political phenomena. However, social media data is inherently ephemeral, subject to continuous evolution driven by changes in platform leadership, economic gain, and shifting policies. For this reason, it is essential to preserve and provide reliable and sustainable access for the (re)use of such an important resource.

Steps towards the development of a social media and algorithmic data service.

The Algorithmic Archive project is structured in four interconnected phases that investigate the research, archiving, legal and technical landscape to inform the Bodleian Libraries’ future development of a social and algorithmic data service.

The image below offers a visual summary of the work packages that the Research Officers have been exploring over this one-year project.

In upcoming blog posts, we will present some of the results and highlight use cases drawn from research conducted with social media data.

Reporting from the Born-Digital Collections, Archives and Memory Conference 2025

From 2 to 4 April 2025, I attended the very first edition of the Born-digital Collections, Archives and Memory conference, together with my colleague from the Algorithmic Archive Project, Pierre Marshall. The conference was co-organised by the School of Advanced Study at the University of London, the Endangered Material Knowledge Programme at The British Museum, The British Library and Aarhus University. This international event offered a unique opportunity to bring together academics and practitioners from diverse disciplines, career paths and backgrounds to explore the transformative impact of born-digital cultural heritage. The diverse range of research, methodologies, and practices presented in this year’s programme offered valuable insights and reflections, particularly relevant to the Algorithmic Archive project and its goal of developing sustainable, persistent approaches to preserving born-digital heritage created on the web, especially on social media platforms.

The inspiring opening keynote by Dorothy Berry, Digital Curator at the Smithsonian National Museum of African American History and Culture, highlighted the vital importance of preserving ephemeral and fragile forms of born-digital heritage (such as social media), many of which have increasingly replaced traditional modes of memory-making. She also drew attention to the pressing need for a deeper understanding of what born-digital memory should be preserved and how. In particular, she stressed the need to record the “full context” in which born-digital records and materials were embedded before being collected and included in specific collections. However, she also highlighted the challenges many memory institutions face due to uneven resource distribution, an issue that may hinder both the development and long-term sustainability of innovative preservation efforts.

Given the richness of the BDCAM25 programme, it is incredibly difficult to summarise the many takeaways from the three-day conference. Nevertheless, it is worth highlighting sessions such as the one exploring the history, socio-technical dynamics and research conducted on corpora from platforms such as Usenet; the important reflections stemming from a study conducted by Rosario Rogel-Salazar and Alan Colín-Arce exploring the presence of feminist organisations in web archives; and the research conducted by Dr Andrea Stanton exploring Palestine and the concept of Palestinian heritage through the analysis of accounts and hashtags on Instagram.

Particularly valuable insights also came from Dr Kieran Hegarty’s paper, which explored the challenges posed by the unpredictable and frequent changes to platform design and policies, underscoring how these significantly influence what is included in web archives and how the material is made available.

Beveridge Hall entrance, Senate House, University of London. Photo taken by B. Cannelli

Overall, the conference provided a valuable opportunity to learn about new research and to network with scholars and practitioners from around the globe. During lunch and coffee breaks, I had insightful conversations with several delegates about the challenges of preserving born-digital materials, particularly data generated on social media platforms. We exchanged ideas and reinforced the importance of developing shared practices to safeguard these resources. This theme strongly resonated in the closing session, which brought together voices from diverse career paths and regions to reflect on the current state of born-digital archives, collections, and memory, and to identify future directions.
Among the key takeaways were the need to foster data literacy and build digital citizens from a young age, as well as the importance of connecting with activists and minority communities to help them tell and preserve their stories.

Kafka24: Oxford celebrates Franz Kafka

To commemorate the centenary of Franz Kafka’s death on 3 June 1924, the University of Oxford’s summer-long cultural festival Kafka24, inspired by Kafka’s life and work, features theatre, music, cabaret, exhibitions, lectures, talks, and free family activities including the spectacular Jitterbug Tent which will land in University Parks on South Parks Road from Friday 31st May to Sunday 2nd June, and insect activities at the Museum of Natural History on the evening of 5th June.

On the evening of 3rd June, the Bodleian Libraries will host Oxford Reads Kafka in the historic Sheldonian Theatre, a public reading of Kafka’s story ‘Metamorphosis’ in which the hapless Gregor Samsa wakes up to find he’s transformed into a bug, with readers including authors Lemn Sissay, Ben Okri, and Lisa Appignanesi (tickets available online).

And on 30 May the major exhibition Kafka: Making of an Icon, featuring manuscripts from the Bodleian Library’s Kafka archive, opens in the ST Lee Gallery of the Weston Library (free admission).

The full programme of lectures and events is at www.bodleian.ox.ac.uk/kafka24.

Roger Bannister’s world record – 70th anniversary celebrations

This weekend, the city of Oxford is celebrating the anniversary of Roger Bannister’s historic sub-four-minute mile, a world record that the former Oxford (Exeter College) student set at Oxford’s Iffley Road athletic track, 70 years ago on 6 May 1954.

In the Weston Library’s Blackwell Hall, from now until 5pm on 6 May, you will find a small display from his archive, which is now housed at the Bodleian, featuring the event programme for his world record race, original photographs, objects from his athletic career, and letters and papers that reveal his meticulous training.

Meanwhile, runners across the city are invited to join the Bannister Community Mile on Monday 6 May, running from St Aldate’s to the Iffley Road Track, where they will be able to enjoy the Mile Fair with more historic displays. Throughout the day there will be Bannister Track Mile races for invited athletes of all ages, and from 6pm elite racers will attempt to break the current mile records.

Spectator tickets will be free at Iffley Road, with hundreds of walk-up spaces. Arrive early to get your seat.

Reader notice: Library catalogue downtime

Requesting items from closed stacks

Between 16 and 23 August, you will not be able to use SOLO to request items from closed stacks or offsite storage. We strongly recommend that you place any requests through SOLO by 5pm on 11 August.

Libraries will extend item due dates, and items will not be returned to the stacks during the upgrade period.

We will be running a limited service to handle urgent stack requests placed between 16 and 23 August. To place a request, email book.fetch@bodleian.ox.ac.uk. You will only be able to pick up ordered items from the Bodleian Old Library or Weston Library. Please allow 48 hours for your item to be delivered.

 

Orders for manuscript and archival material will be unaffected. Rare Books held onsite can be ordered by emailing specialcollections.bookings@bodleian.ox.ac.uk.

Please email specialcollections.enquiries@bodleian.ox.ac.uk for further assistance.

The Why and How of Digital Archiving

Guest post by Matthew Bell, Summer intern in the Modern Archives & Manuscripts Department

If you have ever wondered how future historians will reconstruct and analyse our present society, you may well have envisioned scholars wading through stacks of printed Tweets, Facebook messages and online quizzes, discussing the relevance of, for instance, GIFs posted in the comment section of a particular politician’s announcement of their candidacy, or what different email autoreplies reveal about communication in the 2010s. The source material for the researcher of this period must, after all, consist overwhelmingly of internet material: the platform for our communication, the source of our news, the medium on which we work. To take but one example, Ofcom’s report on UK consumption of news from 2022 identifies that “The differences between platforms used across age groups are striking; younger age groups continue to be more likely to use the internet and social media for news, whereas their older counterparts favour print, radio and TV”. As this younger generation grows up to take the positions of power in our country, it is clear that, in seeking to understand the cultural background from which they emerged, relying solely on stored physical newspapers will be insufficient. An accurate picture of Britain today will only be possible through careful digital archaeology, sifting through sediments of hyperlinks and screenshots.

This month, through the Oxford University Summer Internship Programme, I was incredibly fortunate to work as an intern in the Bodleian Libraries Web Archive (BLWA) for four weeks, at the cutting edge of digital archiving. One of the first things that became clear speaking to those working in the BLWA is that the world wide web as a source of research material as described above is by no means a foregone conclusion. The perception of the internet as a stable collection that will remain as it is without care and upkeep is a fallacy: websites are taken down, hyperlinks stop working or redirect somewhere else, social media accounts get removed, and companies go bankrupt and stop maintaining their online presence. Digital archiving can feel like a race against time, a push to capture the websites that people use today whilst we still can; without the constant work of web archivists, there is nothing to ensure that the online resources we use will still be available even decades down the line for researchers to consult.

Fortunately, the BLWA is far from alone in this endeavour. Perhaps the most ambitious contemporary web archive is the Internet Archive; since 1996 this archive has built a collection of billions of websites, and states as its task the humble aim of providing “Universal Access to all Knowledge”, seeking to capture the entire internet. Other archives have a slightly more defined scope, such as the UK Web Archive, although even here the task is still an enormous one, of collecting “all UK websites at least once per year.” Because of the scale of online material that is published every day, whether or not a site has been archived by either the Internet Archive or the UK Web Archive has relevance for whether the Bodleian chooses to archive it; to this extent the world of digital archiving represents cooperation on an international scale.

One aspect of these web archives that struck me during my time here is the conscious effort made by many to place the power of web archiving in the hands of anyone with access to a computer. The Internet Archive, for instance, allows any user with a free account to add content to the archive. Furthermore, one of my responsibilities as intern was a research project into the viability of a programme named Webrecorder for capturing more complex sites such as social media, and the democratisation of web archiving seems to be the key purpose of the programme. On their website, which offers free browser-based web archiving tools, the title of the company stands above the powerful rallying call “Web archiving for all!” Whilst the programme currently remains difficult to navigate without a certain level of coding knowledge, and never quite worked as expected during my research, its potential for expanding the responsibility of archiving is certainly exciting. As historians increasingly seek to understand the lives of those whose records have not generally made it into archive collections, one can see as particularly noble the desire to put secure archiving into the hands of people as well as institutions.

The “why” of Digital Archiving, then, seems clear, but what about the “how”? Before going into my main responsibilities this month, some clarification of terminology is necessary.

Capture – This refers to the Bodleian’s copy of a website, a snapshot of it at a particular moment in time which can be navigated exactly like the original.

Live Site – The website as it is available to users on the internet, as opposed to the capture.

Crawl – The process by which a website is captured, as the computer program “crawls” through the live site, clicking on all the links, copying all of the text and photographs, and gathering all of this together into a capture.

Crawl Frequency – The frequency with which a particular website is captured by the Bodleian, determined by a series of criteria including the regularity of the website’s updates.

Archive-It – The website used by the Bodleian to run these crawls, and which stores the captured websites.

Brozzler – A particularly detailed crawl, taking more time but better for dynamic or complicated sites such as social media. Brozzlers are used for Twitter accounts, for instance. Crawls which are not brozzlers are known as standard crawls and use Heritrix software.

Data Budget – The allocated quantity of data the Bodleian Libraries purchase to use on captures, meaning a necessary selectivity as to what is and is not captured.

Quality Assurance (QA) – A huge part of the work of digital archiving, the process by which a capture is compared with the live site and scrutinized for any potential problems in the way it has copied the website, which are then “patched” (fixed). These generally include missing images, stylesheets, or subpages.

Seed – The term for a website which is being captured.

Permission E-Mails – Due to the copyright regulations around web archiving, the BLWA requires permission from the owners of websites before archiving; this can be a particularly complicated task due to the difficulty of finding contact information for many websites, as well as language barriers.
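To make the relationship between a crawl, a capture and quality assurance more concrete, here is a minimal toy model. It does not show how Archive-It, Heritrix or Brozzler work internally: the "live site" is just an in-memory dictionary of links, and `crawl`, `missing_from_capture` and the example paths are hypothetical names chosen for illustration.

```python
from collections import deque
from typing import Dict, List, Set


def crawl(site: Dict[str, List[str]], seed: str) -> Set[str]:
    """Breadth-first walk from the seed, returning every URL reached."""
    captured = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        for link in site.get(url, []):
            if link not in captured:
                captured.add(link)
                queue.append(link)
    return captured


def missing_from_capture(live: Set[str], capture: Set[str]) -> List[str]:
    """QA step: list resources on the live site that the capture lacks."""
    return sorted(live - capture)


# Toy live site: each page maps to the links and resources it references.
LIVE_SITE = {
    "/": ["/about", "/style.css", "/logo.png"],
    "/about": ["/", "/report.pdf"],
}

live_urls = crawl(LIVE_SITE, "/")
# Pretend the crawler failed to copy an image and a PDF.
capture_urls = live_urls - {"/logo.png", "/report.pdf"}

to_patch = missing_from_capture(live_urls, capture_urls)
print(to_patch)
```

In this model, the items returned by `missing_from_capture` correspond to the missing images, stylesheets or subpages that would be patched during QA.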

My responsibilities during my internship were diverse, and my day-to-day work was generally split between quality assurance, setting off crawls, and sending or drafting permission e-mails. Alongside this I was not only carrying out research into Webrecorder, but also contributing to a report re-assessing the crawl frequency of several of our seeds. The work I have done this month has been not only incredibly satisfying (when the computer programme works and you are able to patch a PDF during QA of a website it makes one disproportionately happy), but also rewarding. One missing image or hyperlink at a time, digital archivists are driving the careful maintenance of a particularly fragile medium, but one which is vital for the analysis of everything we are living through today.