
Algorithmic Archive Project: Use Cases (2/3)

The Algorithmic Archive project is a one-year project funded by the Mellon Foundation. As part of the first Work Package, we explored how researchers from different disciplines use social media data to answer various research questions.

This post is the second in a three-part series presenting use cases drawn from research conducted as part of the Algorithmic Archive project.

We would like to thank the researchers who generously shared insights from their work.


Use Case – Exploring Algorithmic Mediation and Recommendation Systems on YouTube [1]

Research questions and aim(s):

The study sought to investigate how the YouTube platform operates, focusing on algorithmic activity and the strategies employed by both human and automated (robot) actors during federal and regional elections. The aim was to understand the impact that this system of mediation has on society and to challenge the preconception that technologies are ideologically neutral in highly disputed political events. The research focuses on two case studies: 1) the 2018 Ontario (Canada) election and 2) the 2018 Brazilian Federal Election. The data collection was carried out during the campaigning periods: between May and June 2018 in Ontario, and between August and October 2018 in Brazil.

Social media data used:

The research focused solely on the YouTube platform. Specifically, the researchers collected information about the videos recommended after searches for specific keywords related to the election campaign.

Tools and methods adopted:

The data collection was carried out using a Python script developed by the Algo Transparency project. The script automates YouTube search operations based on specified keywords (e.g., the names of the candidates), allowing the researcher to gather video-related data together with the ranking position in which each video is displayed to the user. Once the keywords were defined, the tool retrieved links for the top four results for each keyword and then examined the recommendation section of each video. This process was repeated four times, each time collecting the recommended videos, to simulate a user following algorithmic suggestions.
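The collection loop described above can be sketched roughly as follows. This is a minimal illustration only, not the Algo Transparency script itself: the function crawl_recommendations, the toy_search and toy_recommend stand-ins, and the record fields are all hypothetical, and no request is sent to YouTube. The text gives the top four search results and the four repetitions; the number of recommendations followed per video (branching) is an assumption.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def crawl_recommendations(
    keywords: List[str],
    search: Callable[[str], List[str]],
    recommend: Callable[[str], List[str]],
    top_n: int = 4,      # top search results kept per keyword (from the text)
    rounds: int = 4,     # repetitions of the recommendation step (from the text)
    branching: int = 2,  # recommendations followed per video (assumption)
) -> List[Record]:
    """Record each video seen, with its ranking position and where it came from."""
    records: List[Record] = []
    for keyword in keywords:
        seen = set()
        frontier: List[str] = []
        # Round 0: the ranked search results for the keyword.
        for rank, video in enumerate(search(keyword)[:top_n], start=1):
            records.append({"keyword": keyword, "round": 0, "rank": rank,
                            "video": video, "source": None})
            if video not in seen:
                seen.add(video)
                frontier.append(video)
        # Rounds 1..rounds: follow the recommendations of the previous round.
        for round_no in range(1, rounds + 1):
            next_frontier: List[str] = []
            for source in frontier:
                for rank, video in enumerate(recommend(source)[:branching], start=1):
                    records.append({"keyword": keyword, "round": round_no,
                                    "rank": rank, "video": video, "source": source})
                    # Only videos not yet visited are followed further.
                    if video not in seen:
                        seen.add(video)
                        next_frontier.append(video)
            frontier = next_frontier
    return records


# Hypothetical stand-ins for the platform, so the sketch runs offline.
def toy_search(keyword: str) -> List[str]:
    return [f"{keyword}-result{i}" for i in range(1, 11)]


def toy_recommend(video_id: str) -> List[str]:
    return [f"{video_id}>rec{i}" for i in range(1, 6)]


rows = crawl_recommendations(["candidate A"], toy_search, toy_recommend)
print(len(rows), rows[0])
```

Keeping the keyword, round, rank and source video for every record is what would make it possible to reconstruct how a simulated user moved from a search result to later recommendations.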

Data collected was stored on personal devices and the institutional cloud, and can be visualized at the following links:


[1] Reis, R., Zanetti, D., & Frizzera, L. (2020). A conveniência dos algoritmos: o papel do YouTube nas eleições brasileiras de 2018. Compolítica, 10(1), 35–58. https://doi.org/10.21878/compolitica.2020.10.1.333

The Algorithmic Archive: a project overview

What is the Algorithmic Archive Project?

In 2024, the Algorithmic Archive Project received funding from the Mellon Foundation to carry out scoping research that will ultimately support the Bodleian Libraries in developing a lasting, interoperable infrastructure and sustainable strategies for archiving web-based data, including social media data and algorithms. The project is part of the broader Future Bodleian programme, which aims to expand and evolve the Libraries' centuries-old role by engaging with the digital domain.

Why archive social media data?

In the past two decades, social media platforms have become a central means of communication, enabling people from across the globe to engage in discussions that transcend geographical borders, reflect on contemporary events and contribute to collective memory. Given their profound impact on society, researchers across various disciplines increasingly rely on social media data to analyse social, economic, and political phenomena. However, social media data is inherently ephemeral, subject to continuous change driven by shifts in platform leadership, economic interests, and policies. For this reason, it is essential to preserve this important resource and to provide reliable and sustainable access for its (re)use.

Steps towards the development of a social media and algorithmic data service

The Algorithmic Archive project is organised into four interconnected phases that investigate the research, archiving, legal and technical landscapes in order to inform the Bodleian Libraries’ future development of a social and algorithmic data service.

The image below offers a visual summary of the work packages that the Research Officers have been exploring over this one-year project.

In upcoming blog posts, we will present some of the results and highlight use cases drawn from research conducted with social media data.

Highlights and Takeaways from the Association of Internet Researchers Annual Conference (AoIR) 2024

At the end of October, I had the opportunity to attend the 2024 Association of Internet Researchers (AoIR) conference, which took place in the lovely city of Sheffield. This was my first time attending an AoIR conference and I was grateful to join such a vibrant meeting of Internet researchers from all over the world. As a Curatorial and Policy Research Officer for the Algorithmic Archive Project, currently exploring the ways in which social media and algorithmic data are being used across disciplines, this was a unique opportunity for me to engage with a diverse range of research on the web and social platforms.

This year’s AoIR conference was hosted by the University of Sheffield, with the Student Union building serving as the main venue. This impressive structure spans five floors and includes a cosy lounge area on the third floor, offering attendees a space to relax and network between sessions in a packed four-day programme. The main theme of AoIR2024 was “industry”, inviting the research community to reflect on and discuss the relationship between the internet and industry. With over thirteen parallel sessions scheduled for each time block, choosing just one to attend proved to be rather challenging.

A view of the University of Sheffield Student Union, where some of the AoIR2024 conference sessions took place between 30 October and 2 November 2024. Photo taken by B. Cannelli.

One aspect of the conference that really stood out to me was the diverse range of research involving information generated on social media platforms, ranging from creator economy dynamics and news polarization to AI applied in the context of online communities and content moderation, online pop culture, and disinformation across various platforms. There were several panels discussing platform governance – the set of rules, policies and decision-making processes that shape how content is collected, accessed and used within a platform – shedding light on the power dynamics that influence user experience. From an archival perspective, understanding how platforms regulate access to data and the consumption of content is crucial, as it has significant implications for how this content can be archived by memory institutions.

Among the many sessions exploring virality phenomena and cultures on social media, it is worth mentioning the one reflecting on “mediated memory”. It examined how social platforms like TikTok serve, for instance, as spaces to remember displaced cultures, and how they facilitate the transmission of cultural aspects to younger generations, helping to perpetuate them through time and space. Additionally, the session titled “Times and Transformations” provided some excellent examples of research conducted with web-archived content from research libraries, along with insightful reflections on the epistemology of web archiving.

Firth Court, a Grade II listed Edwardian building that constitutes part of the Western Bank Campus of the University of Sheffield. Photo taken by B. Cannelli.

Overall, the conference highlighted the crucial role social media data play in today’s communication landscape and underscored the value of platforms’ user-generated content as a key resource for researchers across a wide range of disciplines. The interplay of light and shadows explored in various panels on platform governance further emphasised the enormous power platforms hold over this user-generated data, as well as the pressing need for support to enable researchers to access and preserve these data over time. 

I left the AoIR2024 conference with so much food for thought! It has also been a fantastic opportunity for networking, which will be important for the scoping phase of the Algorithmic Archive project.