Category Archives: Learning and teaching

The Gall of It!

For a long time I’ve been curious about iron gall ink. It’s a term that gets used a lot by archivists, which is unsurprising when it’s the name for the favoured ink used in Europe from the Middle Ages into the twentieth century, with its use spreading around the globe. It’s the ink used on the oldest document in the University Archives that mentions the University as a corporate entity (dated 1214) and it’s likely to be the same type of ink used in some of the records of the University in the 1800s.

A document, written in Latin, in black ink, on parchment

OUA/WPBeta/P/12/1 – The 1214 Award of the Papal Legate

As well as its rich colour and permanent quality (it is remarkably resistant to water), the ink is also known for a less positive aspect – over time it may “eat” through the paper. I vividly recall seeing a photograph of a piece of sheet music which had been written using iron gall ink. After 200 years, corrosion by the ink had left the sheet looking like a pianola roll.

Two pages of a paper booklet, written on with black ink. In places, the ink has burned through the paper, leaving blank spaces.

MS. Rawl. D. 869 – volume one of the papers of Philip Henry Zollmann, showing the damage caused by iron gall ink

I decided that, in order to appreciate the ink a little better, the best thing to do was to make some, using an original recipe and reading secondary sources to understand the process. The recipe I settled on (as guided by this video) was that of Ugo da Carpi, Thesauro de Scrittori (Rome, 1535), reproduced in Renaissance Secrets: Recipes and Formulas (Wheeler, V&A Publishing, 2009). The recipe reads:

“Take an ounce of gallnuts crushed into little pieces. Then put into a linen cloth. Tie it up, but not too tightly. Leave to soak for at least six days in 12 ounces of rainwater. Next boil until it reduces down to 8 ounces. Strain and add a quarter ounce of German vitriol, ground to a fine powder and half an ounce of gum arabic, steeped in vinegar[…] And you will make a wondrously good ink”
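For anyone who, like me, finds themselves short of the full ounce of galls, the proportions can be scaled down. What follows is only a rough sketch: it assumes that every quantity in the recipe scales in simple proportion to the weight of galls, which da Carpi does not say, and it leaves out the vinegar because the recipe gives no amount for it. The names `RECIPE_OZ` and `scale_recipe` are my own, made up for illustration.

```python
from fractions import Fraction

# Quantities in ounces, taken from the recipe as quoted above.
# The vinegar is omitted because the recipe gives no quantity for it.
RECIPE_OZ = {
    "gallnuts": Fraction(1),
    "rainwater (for soaking)": Fraction(12),
    "liquid after boiling": Fraction(8),
    "vitriol": Fraction(1, 4),
    "gum arabic": Fraction(1, 2),
}


def scale_recipe(galls_available_oz):
    """Scale every quantity in proportion to the galls to hand.

    Assumption (mine, not da Carpi's): the recipe scales linearly.
    """
    factor = Fraction(galls_available_oz) / RECIPE_OZ["gallnuts"]
    return {name: amount * factor for name, amount in RECIPE_OZ.items()}


# Example: a half batch made from half an ounce of galls.
for name, amount in scale_recipe(Fraction(1, 2)).items():
    print(f"{name}: {float(amount):.3f} oz")
```

Using fractions rather than decimals keeps quantities such as a quarter ounce exact until the moment they are printed.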

Gallnuts can be found on a variety of vegetation, but are perhaps best known on oak trees, where they are often called oak galls or oak apples. They are formed after certain types of insect (often a gall wasp) lay their eggs on a tree. When the eggs hatch as larvae, the larvae secrete chemicals which irritate the tree, causing it to produce gall tissue. The gall tissue acts as both a food source for the larvae, and a protective structure in which the larvae can pupate into a wasp. Last year, I asked my family and friends to gather any galls they might see and I noticed they brought back two types – one smooth and round, and the other rather wrinkly.

On a white background are two natural spheres. They are both brown in colour. One is smooth and mottled, the other is wrinkled and darker.

Thought to be an oak apple gall (left) and an oak knopper gall (right)

A Google search reveals that the wrinkly type are caused by the oak knopper gall wasp, and I think the smoother kind are from the oak apple gall wasp (although they are rather small for this type). I wasn’t sure if I could use the wrinkly type in making ink. Most of the YouTube videos showing ink making seem to show the smoother kind, and so I discounted the wrinkled type. Initially, the galls gathered were rather green, and so I left them to ripen to the brown colour that seemed to be in common use. By January, they were ready. Unfortunately, I did not have the ounce required by the recipe and so (keen to get going) I bought some galls online. This provided yet another kind of gall. Given their size and their spiky surface, I believe that these may be Aleppo galls. I also purchased iron(II) sulphate (the modern name for German vitriol, or copperas) and some gum arabic.

A see-through plastic pouch containing a number of brown spheres

Bag of oak galls, purchased from an online retailer, thought to be Aleppo galls

With a sudden wealth of galls, I decided to make two batches of ink – one with primarily “home gathered” galls, and one solely made with the purchased galls.

The first stage of the recipe is to crush and steep the galls in rainwater. Typically, early January was one of the very few periods in the UK with absolutely no rain. Feeling frustrated that I had nearly all of the ingredients I needed, I turned to Assistant Keeper Alice Millea, who had previously mentioned she had water butts in her garden. Alice kindly agreed to the (admittedly odd) request, and brought in a flask of rainwater.

I measured out two jars of 12 ounces of rainwater. Next, I set about crushing the two sets of galls, using a pestle and mortar, starting with the batch primarily gathered nearby. However, the pestle and mortar did not work well on this batch, as the galls were rather spongy and could best be torn apart by hand.

A pottery pestle and mortar containing brown, natural fragments

In contrast to this, the purchased galls were extremely hard and took quite some work with the pestle and mortar to reduce to a reasonable size. From da Carpi’s use of the word “crushed” in the recipe, I would presume he was most used to working with Aleppo galls, given their hardness.

Two images, side by side. The image on the left shows a cross section of a pale beige sphere. The image on the right shows a cross-section of the inside of an orange-brown sphere

The purchased galls (on the left) had to be crushed, whereas the local galls (right) were softer and could be ripped apart.

I placed the ripped and crushed galls on squares of muslin (usually used for my jam making!) and tied them into what I can best describe as giant tea-bags. I carefully lowered them into the jars of rainwater, and left them to steep for six days.

A decorated glass jam jar, nearly full of water. At the top of the water is a pouch of muslin, tied around natural contents.

The purpose of this process is to leach out gallotannic acid from the galls, which will react when further ingredients are added at a later stage of the recipe.

After six days the liquid had turned a strong tea-like colour, and there was a small amount of growth on top of the liquid.

A decorated glass jam jar, held up in front of a window. The liquid inside the jar is a pale brown-orange. There is a muslin pouch with contents at the top of the jar.

I therefore removed the galls and strained each liquid through another square of muslin, before boiling each of the liquids until they were reduced to 8 ounces in weight. Whilst the liquid was boiling, I prepared the other ingredients. I was especially struck by the beautiful pale green colour of the iron(II) sulphate.

A heart-shaped dish on top of digital scales. There is a small amount of pale green powder inside the dish.

I wasn’t sure what the recipe meant by “steeping” the gum arabic in vinegar, so I decided to make a reasonably thick paste. I used red wine vinegar, as I thought this might have been the most readily available vinegar for most of the period in which the ink was in use.

Eventually the liquid was sufficiently reduced, and had turned an even deeper brown colour. I decanted this into clean jars.

A dark brown liquid inside a decorated jam jar.

The first ingredient to add was the iron(II) sulphate. The reaction was immediate and impressive.

Orange-brown liquid in a jam jar turning black when a powder is added

The change from transparent brown to dusky black was striking. What’s happening is a chemical reaction. When the tannin (from the gallotannic acid) interacts with the iron sulphate, it forms a “ferrous tannate complex”, essentially a dusky-coloured pigment.

The addition of Gum Arabic serves a number of practical purposes. It acts as a suspension agent for the pigment particles present in the liquid, keeping them distributed throughout the ink. It controls the thickness and flow of the ink, ensuring it is the right consistency for writing. It also controls the absorption of the ink into the writing surface, keeping it “on the top” for a little longer, before allowing the ink to be absorbed into the paper or parchment, making for sharper, cleaner writing marks.

Vinegar isn’t present in all iron gall ink recipes, but it is credited with slowing down the settling of pigment particles to the bottom of the ink, and with inhibiting mould growth during storage.

I could hardly wait to try out the ink (having a new dip-pen ready and waiting) but I restrained myself and waited the suggested 24 hours. Both inks when opened smelled of vinegar, but not overwhelmingly so. The ink is a very thin liquid, and it is easy to overload the pen. One noticeable difference between the two inks is that the “home grown” gall ink was black the moment it hit the page, but this could have been partly due to an overabundance of ink on the pen.

"Hello World" written in black ink on a decorative cream card

In contrast to this, the bought gall ink was rather pale when first applied. However, this colour darkened within a few minutes to jet black. The reason for this change in colour is that the iron ions in the mixture are oxidised on exposure to the air, producing (from the ferrous tannate complex) a ferric tannate pigment with a darker colour, and thus a darker ink.

Furthermore, whilst the ferrous tannate complex is water soluble, the ferric tannate pigment is not, making the ink water resistant.

One final piece of curiosity was, unfortunately, not to be satisfied. I have often wondered what made different varieties of the same ink act so differently. Why do some versions of the ink “eat through” parchments, whilst others cause no damage to the surface? An obvious component is the acidity of the ink, and given that the galls are a variable source of gallotannic acid, I wondered whether different batches of galls would produce different amounts of acid. The Bodleian Conservation department very kindly provided me with some pH testing strips. Unfortunately, the ink simply turned them black, preventing any readings from being taken!

A small warning to those who intend to experiment for themselves – do remember to tighten the lid before shaking the ink…

Sources, Further Reading and Watching

Wheeler, Jo, and Katy Temple. Renaissance Secrets: Recipes and Formulas. London: V&A, 2009.

https://irongallink.org/

https://en.wikipedia.org/wiki/Iron_gall_ink

https://www.rhs.org.uk/biodiversity/oak-gall-wasps

https://www.youtube.com/watch?v=xo9rbRRCBv8

Conference Report: Archives and Records Association Annual Conference 2021

The Archives and Records Association (ARA) Annual Conference 2021 was held 1st–3rd September 2021. In this blog post, Rachael Marsay reports on some of the highlights of the conference, held entirely online this year for the first time.


Logo for the Archives and Records Association 2021 Virtual Conference

There were three themes to this year’s conference: sustainability, diversity, and advocacy. Though each day of the conference covered one theme, one of the stand-outs of the conference was just how interlinked all three strands were.

Day one’s keynote speaker was Jeff James, Chief Executive and Keeper at The National Archives. Jeff talked about environmental sustainability, as well as the sustainability of the record and of the archives sector. He mentioned how The National Archives at Kew are committed to lowering their carbon footprint, which has been reduced by 80% since 2009. This has been achieved by building on scientific research with regard to buildings, bringing both a financial and environmental benefit. He also spoke of records at risk, referring to the work of the Culture Recovery Fund, the Covid-19 Archives Fund for records at risk and the Crisis Management Team alongside already established funding streams such as the Archives Revealed grant scheme. Digital records were flagged as records at risk and he stressed the need for the sector to work in partnership and collaboration, both together and with digital giants (such as Microsoft and Google), in developing digital products. On sector skills, he highlighted the need for records professionals to gain digital skills through schemes and strategies such as Plugged In Powered Up, the Novice to Know-How online training resource created by the Digital Preservation Coalition, the Digital Archives Learning Exchange, and the Bridging the Gap traineeship programme.

The fragility of born-digital records, identified as critically endangered by the Digital Preservation Coalition, was a common theme throughout the conference. Even the most modern of records are at risk (CD-Rs, for example, have a lifespan of under 10 years). Particular digital records discussed related to oral history interviews, often seen as ‘history from below’, recording the lives of those with ‘hidden histories’ missing from mainstream records, such as women and members of the LGBTQ+ community. Challenges in preserving digital material include cost, knowledge, skills and training, technology, and resources, as well as issues surrounding ‘gatekeeping’ and access to material. Rachel MacGregor (Digital Preservation Officer at The Modern Records Centre, University of Warwick) emphasised the need to record, describe, and catalogue born-digital collections well in order to ensure that they can be utilised by researchers, and explored some of the standards and guidance currently available.

Day two’s keynote speaker was Arike Oke (Managing Director, Black Cultural Archives) who spoke about experiences with diversity, aptly described as the equitable and mindful bringing together of difference; diversity should not be seen as static, but as a perpetual movement, both including and evolving difference. In her talk, Arike raised the point of classifying and being classified, and several sessions across the three days referred to how language and terminology impacted the use of records or archives created by or for particular communities. The use of historic terminology can be a barrier to access, particularly when words hold negative connotations that can cause distress to users. This was explored in several sessions in relation to LGBTQ+ related records and archives (including those kept at the Parliamentary Archives of the UK Parliament), as well as colonial collections such as the Miscellaneous Reports Collection held by the Royal Botanic Gardens in Kew. Thoughts on how to address the issues included guides or notes explaining the context and why such words were used, including modern terms or names in brackets, inviting feedback, and for events, giving participants time and space to process information.

The importance of being open to keeping more ephemeral material and objects (e.g. pin badges, leaflets and posters) was also highlighted, particularly in shedding light on lives not necessarily recorded in more traditional forms. Christopher Hilton of Britten Pears Arts gave an interesting presentation on the multitude of receipts kept by Benjamin Britten and his partner Peter Pears for tax purposes. The receipts were important in shedding light on their relationship by providing evidence that they maintained clearly separate financial lives, demonstrating how important it was for their professional lives at that period that their records could be used to demonstrate a ‘plausible deniability’ should their personal relationship be questioned. The receipts were also records of businesses in Aldeburgh which are now long gone, provoking memories for older residents and providing a tangible link between the archive and the town.

Day three’s keynote speaker was Deirdre McParland, Senior Archivist at the Electricity Supply Board (Ireland) whose inspirational talk focussed on the importance of advocacy and that ‘archives are for life, not just anniversaries’. Deirdre spoke of how archives should be pro-active and innovative when it comes to advocacy, and that projects should be strategically planned to include promotion as standard. Deirdre’s talk was followed by a talk by Jenny Moran and Robin Jenkins from the Record Office for Leicestershire, Leicester and Rutland, and Richard Wiltshire of the Crisis Management Team. Jenny, Robin and Richard talked about saving the archive of the travel firm Thomas Cook after the company’s sudden collapse: an excellent example of how swift action, negotiation and successful advocacy led to the ensured survival of the archive. The conference was nicely brought to a close by a talk by Alan and Bethan Ward on their project Photographs from Another Place. Their talk, given from the perspective of the archive user, showed how a bit of archival research revealed the names and stories behind a group of forgotten and unlabelled glass plate negatives. It was, for me at least, a timely reminder of the enduring value of archives.


A selection of further reading recommendations made by speakers and participants:

 

Frankenstein Revisited at the Bodleian Libraries

The Abinger Papers (manuscripts of the Shelley and Godwin families, including drafts of Mary Shelley’s Frankenstein) can undoubtedly be counted among some of the greatest treasures of the Bodleian Libraries and, last year, I was invited by the Bodleian Libraries’ Education Team to take part in three study days bringing the text of Frankenstein to life (as it were).

The Bodleian Libraries held two successful Frankenstein Revisited study days for KS4 and KS5 pupils from local schools in November 2019, building upon the success of three study days originally held in 2018 as part of the bicentenary celebrations of the publication of Frankenstein. Due to popular demand, a further study day was held in January 2020, but in a slightly different format. The study days were funded by the Helen Hamlyn Trust and, in total, 163 students from seven local state schools attended. The format of the days was designed to be varied and tie in with the curriculum for English Literature.

The November study days included two half-hour university style lectures (for the KS5 pupils) and a contemporary theatrical performance (‘The Two-Body Problem’ by Louis Rogers, performed by Martha Skye Murphy) followed by three ‘hands-on’ sessions when the students were split into smaller groups: one with live demonstrations of historical artefacts at the History of Science Museum, one looking at original Shelley-Godwin family manuscripts at the Weston Library, and one textual editing session focussing on the original manuscript of Frankenstein.

The creature comes to life: page from Mary Shelley’s manuscript of Frankenstein, with annotations by Percy Bysshe Shelley. Oxford, Bodleian Libraries, MS. Abinger c. 56, fol. 21r

I led the half-hour sessions with the original family manuscripts for small groups of students: though this meant running the same session several times back to back, the students all got the opportunity to get close to the manuscripts. Once the groups had settled down, I began a roughly chronological journey through the manuscripts charting the life of Mary Shelley: beginning with the last notes from her mother Mary Wollstonecraft to her father William Godwin on the day of her birth, through to the journals chronicling her elopement with Percy Bysshe Shelley and the death of their first child, the manuscript of Frankenstein and finishing with Percy Shelley’s ‘drowned’ notebook.

I tried to get the groups to think about the nature of a manuscript and what they thought were the major differences between a copy of the printed text and the manuscript written by Mary Shelley. I also raised the question of manuscript survival and the memorial nature of many of the items, reverently kept in turn by surviving members of the family. Percy Shelley’s water-damaged notebook also raised questions of the physicality of items: the groups were generally able to surmise what had caused the damage to the notebook and some of the older pupils were able to guess, before I explained it, that the notebook was on board Percy’s boat when he died.

Overall, the sessions were successful and we received lots of positive feedback from the students including: ‘Fascinating to see Mary Shelley’s more personal thoughts and the original, unedited tale’. The students wrote that the sessions made them ‘feel more engaged to the text’ and found it ‘amazing to be close to the story so physically’. Perhaps most importantly, it was ‘surreal and completely different to school’.

– Rachael Marsay

More information about items in the Abinger and Shelley collections can be found via Shelley’s Ghost, the Bodleian Libraries’ online exhibition, Digital Bodleian, and also The Bodleian Libraries Podcasts (BODcasts).

A longer version of this blog post was originally published on the Archives for Learning and Education Section of the Archives and Records Association’s blog on 10th April 2020.

Please note that, following guidance from the UK Government and Public Health England, the Bodleian Libraries are closed until further notice. Please check the Bodleian Libraries website and Bodleian Twitter for the latest information.

“Steps taken by the Irish government to deal with disloyalty, 11 Dec 1914”

A digitised and transcribed edition of a memo from the archive of British civil servant Francis Hopwood (Baron Southborough) is now available through the Taylor Institution Library’s Taylor Editions site. Initialled ‘MN’ by Sir Matthew Nathan, who was the Under-Secretary for Ireland from 1914 to 1916, the memo details the suppression of “seditious” speech in Ireland at the beginning of World War I, which included shutting down Nationalist newspapers and monitoring public speeches.

The memo formed part of a package of papers that was passed to Lord Southborough when he served as general secretary to the 1917-1918 Irish Convention. The Convention tried to find a path towards Irish self-government following the 1916 Easter Rising; however, its final report, which recommended the immediate establishment of All-Ireland Home Rule, was fatally undermined by Britain’s desperate need for soldiers. In April 1918, Britain imposed conscription on Ireland and attempted to link conscription with the implementation of home rule. This move was so unpopular that public opinion swung towards full independence.

Lord Southborough’s archive is held by the Bodleian Library, and catalogued online at Bodleian Archives and Manuscripts. This fascinating collection documents his career as a senior civil servant at the Board of Trade, Colonial Office and the Admiralty and his involvement in numerous government commissions and royal tours. It includes correspondence from Winston Churchill, Admiral Lord Fisher, General Botha, Lord Midleton, Herbert Gladstone, and G.W. Balfour.

The digital edition of this memorandum on seditious speech is the product of a course on imaging, encoding and preservation offered to students, faculty and staff by the librarians of the Taylor Institution Library (the Taylorian), one of the Bodleian Libraries. You can find out more about the digital editions course and Digital Humanities on the Taylorian website.

“All the kick, the go, the cheese”: Lady Clarendon’s letters in Bodleian Student Editions

This term, the Bodleian Student Editions workshops have entered their fourth year.

Students at the 30 October workshop get acquainted with Lady Clarendon’s diaries

They continue to attract students from across the university, undergraduate and postgraduate, arts and science students. This year we have been editing the letters of Katharine, Countess of Clarendon (1810-1874), to her sister-in-law, [Maria] Theresa Lewis, and these letters are proving to be as fascinating as the very popular Penelope Maitland correspondence.  Some of the letters have been uploaded into our ongoing catalogue on Early Modern Letters Online.

Students working on Lady Clarendon’s letters

Staff and students grapple with tricky handwriting, 6 Nov 2018

These letters fulfil the criteria that we have laid down for suitable material for the workshops – they are in good condition, unpublished, interesting, readable for non-specialists, have no copyright complications, and are in a format that allows the letters to be distributed among the students in the workshop. As the students work in pairs, we require six  or seven individual letters in each workshop, with more in reserve should the transcripts be completed quickly. The perfect format is the fascicule which makes the letters much easier to handle – one fascicule can be given to each pair. Inevitably, most of the good runs of letters that fulfil these requirements tend to be in 19th-century collections of papers that were never bound. This allows us to make a virtue of necessity, because there are very large collections of 19th-century letters acquired relatively recently (i.e. post-1970) that are well worth exploring for their historical interest.

Lady Clarendon’s letters in fascicules

The Lady Clarendon letters were selected in early 2018 by me and Balliol student Stephanie Kelley, the Balliol-Bodley scholar, who also provided digital photographs of many of the letters. Though the workshops give access to original papers, digital images are also made available for detailed checking of difficult words.

The letters were purchased by the Bodleian in 1982, to add to the archive of her husband the 4th Earl of Clarendon already deposited here in 1949 (the 4th Earl’s papers were transferred to Library ownership in 2013). The choice of Lady Clarendon as a subject for the workshops is fortunate in that this year we have been joined by Andrew Cusworth, who is placed in the Bodleian in connection with the Prince Albert Digitisation Project. The Earl and Countess of Clarendon were intimate with Queen Victoria and Prince Albert, and court gossip is one of the interesting aspects of the letters.

Lady Clarendon to Theresa Lewis, Vice Regal Lodge, Dublin, 14 Dec 1847

George Villiers, 4th Earl of Clarendon (1800-1870), was a major political figure of the mid-Victorian period, and his wife’s letters are of considerable political interest as she was his confidante in many matters. In the period covered by the letters, Clarendon was Lord Lieutenant of Ireland from 1847 to 1852, and then Foreign Secretary from 1853 to 1858. His career therefore coincided with major events including the Irish Famine, the Young Ireland rebellion of 1848, the Crimean War and the Indian uprising known as the ‘Mutiny’. The recipient of Lady Clarendon’s letters was Maria Theresa Lewis (née Villiers), Clarendon’s sister, and the wife of George Cornewall Lewis (1806-1863), another Liberal politician who served as Under-Secretary of State for Home Affairs from 1847 to 1850, Chancellor of the Exchequer 1855 to 1858, Home Secretary 1859 to 1861, and War Secretary from 1861 to 1863. The letters do not discuss only politics, however. There is a great deal about family matters: the activities and, above all, the illnesses of children, parents and other family members. Lady Clarendon’s lively style provides a very accessible glimpse of aristocratic Victorian life and preoccupations, and the student editions will provide a very useful adjunct to the catalogues of the various parts of the extensive Clarendon archives in the Bodleian.

The workshops have been kept entertained by Lady Clarendon’s fascinating take on mid-Victorian life. Here are just a few examples of her inimitable style – more extracts will follow in a later blog post, so watch this space! All letters are to her sister-in-law Theresa Lewis.

Vice Regal Lodge, 22 Sep 1847 – on the arrival of her mother-in-law in Ireland

Here is Mrs. George sick, tired, but having had a good short passage … she has blue pilled and Speedimanis’d … [Speediman’s pills were a Victorian remedy for stomach complaints]

Vice Regal Lodge, 14 Dec 1847 – on Irish troubles

Lord Clancarty told me … that Bishop Derry the Catholic Bishop of Clonfort had inadvertently let out before Lord Sligo dining out somewhere that the landlords who had been shot deserved it richly!!!! – this Bishop is a Jesuit, I believe a clever and a wily man, but saying this was a great slip…

Vice Regal Lodge, 17 Dec 1847 – forgets to report the birth of her sixth child!

George Lewis’s Board of Controul office, his most excellent début in Parliament, on your side the water, and our dreadful murders and George’s administrative atchievements on this side have been deeply interesting to us both – only think of my not mentioning George Patrick Hyde’s birth too amongst the remarkable events!!

Vice Regal Lodge, 1 Jan 1848 – ‘my unavailing head’

 … George depends upon me for writing to you for him too as tho’ always busy he is particularly overwhelmed to-day and at this moment I hear the murmuring voices of Attorney Generals and Lord Chief Justices in his room settling all sorts of coercive and improvement measures and I don’t venture even to pop my ‘unavailing’ head (as he calls it) in…

[in the same letter] – a present that is ‘all “the kick, the go, the cheese”’

… Mama is leaving us with Robert this afternoon … – they take two small parcels to London. There is a small locket of blue enamel and rose diamonds with George’s and my hair in it, which we present with a joint kiss to you as a little Xmas souvenir– There is a chatelaine in steel which is all “the kick, the go, the cheese” and which I send to Thérèse as my birthday present …

OED  chatelaine: ‘an ornamental appendage worn by ladies at their waist … consists of a number of short chains attached to the girdle or belt … bearing articles of household use and ornament, as keys, corkscrews, scissors, penknife, pin-cushion, thimble-case, watch etc …’

OED the kick: the fashion, the newest style

OED the go: the height of fashion; the ‘in’ thing, the ‘rage’.

OED the cheese: colloquial. Obsolete. The right, correct, or best thing; something first-rate, genuine, or exemplary.

Students share an amusing anecdote with staff.

Bodleian Student Editions workshops are organised by Helen Brown (DPhil candidate in English), Andrew Cusworth, Chris Fletcher, Miranda Lewis (Cultures of Knowledge), Olivia Thompson (DPhil candidate in Ancient History), and Mike Webb, as a collaboration between the Department of Special Collections, Centre for Digital Scholarship, and Cultures of Knowledge. All photographs by Olivia Thompson.

Archives Unleashed – Vancouver Datathon

On the 1st-2nd of November 2018 I was lucky enough to attend the Archives Unleashed Datathon Vancouver, co-hosted by the Archives Unleashed Team and Simon Fraser University Library along with KEY (SFU Big Data Initiative). I am very grateful for the generous travel grant from the Andrew W. Mellon Foundation that made this possible.

The SFU campus at the Harbour Centre was an amazing venue for the Datathon and it was nice to be able to take in some views of the surrounding mountains.

About the Archives Unleashed Project

The Archives Unleashed Project is a three-year project with a focus on making historical internet content easily accessible to scholars and researchers whose interests lie in exploring and researching both the recent past and contemporary history.

After a series of datathons held at a number of international institutions such as the British Library, University of Toronto, Library of Congress and the Internet Archive, the Archives Unleashed Team identified some key areas of development that would help to deliver their aim of making petabytes of valuable web content accessible.

Key Areas of Development
  • Better analytics tools
  • Community infrastructure
  • Accessible web archival interfaces

By engaging and building a community, alongside developing web archive search and data analysis tools the project is successfully enabling a wide range of people including scholars, programmers, archivists and librarians to “access, share and investigate recent history since the early days of the World Wide Web.”

The project has a three-pronged approach
  1. Build a software toolkit (Archives Unleashed Toolkit)
  2. Deploy the toolkit in a cloud-based environment (Archives Unleashed Cloud)
  3. Build a cohesive user community that is sustainable and inclusive by bringing together the project team members with archivists, librarians and researchers (Datathons)

Archives Unleashed Toolkit

The Archives Unleashed Toolkit (AUT) is an open-source platform for analysing web archives with Apache Spark. I was really impressed by AUT due to its scalability, relative ease of use and the huge number of analytical options it provides. It can work on a laptop (Mac OS, Linux or Windows), a powerful cluster or a single-node server, and if you wanted to, you could even use a Raspberry Pi to run AUT. The Toolkit allows for a number of search functions across the entirety of a web archive collection. You can filter collections by domain, URL pattern, date, language and more. You can create lists of URLs and return the top ten in a collection, or extract plain text from the HTML files in an ARC or WARC file and clean the data by removing ‘boilerplate’ content such as advertisements. It’s also possible to use the Stanford Named Entity Recognizer (NER) to extract named entities such as locations, organisations and persons. I’m looking forward to seeing how this functionality is adapted to localised instances and controlled vocabularies – would it be possible to run a similar programme for automated tagging of web archive collections in the future? Maybe you could ingest a collection into AUT, run NER and automatically tag up the data, providing richer metadata for web archives and subsequent research.
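
To give a feel for what “filter by domain” and “top ten URLs” mean in practice, here is a minimal Python sketch. It is not AUT’s own API (AUT runs on Apache Spark); the function names and the sample URLs are hypothetical and only illustrate the idea on a plain list of URLs.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url):
    """Return the lower-cased host of a URL, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def top_domains(urls, n=10):
    """Count how often each domain occurs and return the n most common."""
    return Counter(domain_of(u) for u in urls).most_common(n)


def filter_by_domain(urls, domain):
    """Keep only the URLs whose host matches the given domain."""
    return [u for u in urls if domain_of(u) == domain]


# Hypothetical sample; a real collection would supply many more URLs.
sample_urls = [
    "http://www.example.org/about",
    "http://example.org/news/2018",
    "https://archive.example.com/item/1",
]
print(top_domains(sample_urls))
print(filter_by_domain(sample_urls, "example.org"))
```

On a real collection the same counting would be done at scale by the toolkit; the sketch only shows the shape of the result, a ranked list of domains with their frequencies.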

Archives Unleashed Cloud

The Archives Unleashed Cloud (AUK) is a GUI-based front end for working with AUT. It essentially provides an accessible interface for generating research derivatives from web archive files (WARCs). With a few clicks users can ingest and sync Archive-it collections, analyse the collections, create network graphs and visualise connections and nodes. It is currently free to use and runs on AUK central servers.

My experience at the Vancouver Datathon

The datathons bring together a small group of 15-20 people of varied professional backgrounds and experience to work and experiment with the Archives Unleashed Toolkit and the Archives Unleashed Cloud. I really like that the team have chosen to keep the number of attendees small, because it created a close-knit working group that was full of collaboration and exchange of knowledge and ideas. It was a relaxed, fun and friendly environment to work in.

Day One

After a quick coffee and light breakfast, the Datathon opened with introductory talks from project team members Ian Milligan (Principal Investigator), Nick Ruest (Co-Principal Investigator) and Samantha Fritz (Project Manager), relating to the project – its goals and outcomes, the toolkit, available datasets and event logistics.

Another quick coffee break and it was back to work – participants were asked to think about the datasets that interested them, techniques they might want to use and questions or themes they would like to explore and write these on sticky notes.

Once the notes were placed on the whiteboard, teams naturally formed around datasets, themes and questions. The team I was in included Kathleen Reed and Ben O’Brien and formed around a common interest in exploring the First Nations and Indigenous communities dataset.

Virtual machines were kindly provided by Compute Canada and were available throughout the Datathon to run AUT. Datasets were preloaded onto these VMs and a number of derivative files had already been created. We spent some time brainstorming, sharing ideas and exploring datasets using a number of different tools. The day finished with some informative lightning talks about the work participants had been doing with web archives at their home institutions.

Day Two

On day two we continued to explore the datasets, using the full-text derivatives to run NER and performing keyword searches with the command-line tool grep. We also ran sentiment analysis on the text with the Natural Language Toolkit. To help visualise the data, we took the new text files produced by the keyword searches and uploaded them into Voyant Tools. This helped by visualising links between words, creating a list of top terms and providing quantitative data such as how many times each word appears. It was here we found that the word ‘letter’ appeared quite frequently and we finalised the dataset we would be using – University of British Columbia – bc-hydro-site-c.
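
The keyword search and term counting described above can be approximated in a few lines of Python. This is only a sketch of the general technique, not the commands we actually ran: the sample text is invented, and the function names are hypothetical.

```python
import re
from collections import Counter


def matching_lines(text, keyword):
    """Return the lines that contain the keyword, ignoring case."""
    needle = keyword.lower()
    return [line for line in text.splitlines() if needle in line.lower()]


def top_terms(lines, n=5):
    """Count lower-cased words across the lines and return the n most common."""
    words = re.findall(r"[a-z']+", " ".join(lines).lower())
    return Counter(words).most_common(n)


# Invented sample text standing in for a full-text derivative file.
sample_text = (
    "A letter about the dam.\n"
    "Nothing relevant here.\n"
    "Another LETTER opposing the dam."
)
hits = matching_lines(sample_text, "letter")
print(hits)
print(top_terms(hits, 3))
```

A tool such as Voyant adds visualisation on top of counts like these; the sketch stops at the raw frequencies.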

We hunted down the site and found it contained a number of letters from people about the BC Hydro Dam Project. The problem was that the letters were in a table and when extracted the data was not clean enough. Ben O’Brien came up with a clever extraction solution utilising the raw HTML files and some script magic. The data was then prepped for geocoding by Kathleen Reed to show the geographical spread of the letter writers, hot-spots and timeline, a useful way of looking at the issue from the perspective of engagement and the community.
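
I don’t have Ben’s script to share, so the following is a hypothetical illustration of the general approach rather than his solution: pulling the text of table cells straight out of raw HTML with Python’s standard `html.parser` module. The class name, the sample markup and the column layout are all assumptions.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Collapse runs of whitespace left over from the markup.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows with no <td> cells, e.g. header rows
                self.rows.append(self._row)
            self._row = None
            self._cell = None


# Invented markup; the real page layout may well have differed.
sample_html = """
<table>
  <tr><th>Name</th><th>Town</th><th>Letter</th></tr>
  <tr><td>A. Writer</td><td>Sampletown</td><td>I am writing about
      the dam.</td></tr>
</table>
"""
parser = TableRowParser()
parser.feed(sample_html)
print(parser.rows)
```

Each row comes out as a list of clean strings, which is a convenient shape for writing to CSV before geocoding.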

Map of letter writers.

Time Lapse of locations of letter writers. 

At the end of day 2 each team had a chance to present their project to the other teams. You can view the presentation (Exploring Letters of protest for the BC Hydro Dam Site C) we prepared here, as well as the other team projects.

Why Web Archives Matter

How we preserve, collect, share and exchange cultural information has changed dramatically. The act of remembering at National Institutes and Libraries has altered greatly in terms of scope, speed and scale due to the web. The way in which we provide access to, use and engage with archival material has been disrupted. All current and future historians who want to study the periods after the 1990s will have to use web archives as a resource. Currently, accessibility and usability lag behind, and many students and historians are not ready. Projects like Archives Unleashed will help to equip researchers, historians, students and the community with the necessary tools to combat these problems. I look forward to seeing the next steps the project takes.

Archives Unleashed are currently accepting submissions for the next Datathon in March 2019. I highly recommend it.

Oxford College Archives

A new website for Oxford College Archives has been launched at https://oac.web.ox.ac.uk/.

Painting of Oxford students entitled 'Conversation Piece, Worcester College' by Edward Halliday

The site includes a general introduction to the archives held by the Oxford colleges, individual pages on most of the colleges (with further links to catalogues etc.) and links to associated archives in the City and University. There is also an FAQ page, a glossary of all those odd Oxford terms, and a bibliography. The site will be enhanced and updated regularly.

Oxfam archive inspires potential University of Oxford students

Nineteen year-12 students recently attended a seminar in the Weston Library’s impressive Bahari Room as part of a summer school organised by Wadham College.

The programme allows students from schools with low application/entry rates into higher education to experience university life through a four-day residential. During the visit, students attended lectures, seminars and tutorials, giving them a taste of what it is like to be an undergraduate at the University of Oxford.

The theme for this year was ‘The Politics of Immigration’ and in the seminar, students had the chance to handle a selection of material taken from the Oxfam archive. They were then asked to discuss the representation of Palestinian refugees in the archival documents dating from the 1960s. The material used was taken from the Communications section of the archive – i.e. records of Oxfam’s external communication with the public – and is just a very small example of the material available to the public in the extensive Oxfam archive (the Communications catalogue is online here).

An example of some of the material that the students were using from the Communications section of the Oxfam archive.

Though the students were initially hesitant, we were pleased when two eager volunteers offered to open up the archival boxes and find the files that were needed. After being carefully handled by our volunteers, all the files were laid out for the students to analyse in groups.

Dr. Tom Sinclair and a student unpacking an archival box.

The students then took it in turns to give examples of how Palestinian refugees were represented in the Oxfam material. One of the excellent examples that students spotted was how Oxfam was able to remain politically neutral (a constitutional necessity for charities) by not specifying why the refugees were displaced. Students also remarked that Oxfam preferred to focus on individual stories in their communications – for instance, that of a displaced teenager with aspirations to be an engineer – which the students suggested helped humanise a crisis that could be difficult for the public to comprehend.

The students studied selected material from the Oxfam archive and gave examples of how Palestinian refugees were represented.

Overall, the ‘Politics of Immigration’ seminar was a great success that gave the students a good feel for what it would be like to use the archives to complete research for a dissertation or other academic project.

Dr Tom Sinclair, who organised the summer school, said: “It was such a privilege to be in that lovely room and have such free access to the archives… I really think that a couple of the students were inspired, and I hope they’ll be future Oxford undergraduates visiting the archives again in a few years’ time.”

Bountiful Harvest: Curation, Collection and Use of Web Archives

The theme for the ARA Annual Conference 2017 is: ‘Challenge the Past, Set the Agenda’. I was fortunate enough to attend a pre-conference workshop in Manchester, run by Lori Donovan and Maria Praetzellis from The Internet Archive, about the bountiful harvest that is web content, and the technology, tools and features that enable web archivists to overcome the challenges it presents.

Part I – Collections, Community and Challenges

Lori gave us an insight into the use cases of Archive-it partner organisations to show us the breadth of reasons why other institutions archive the web. The creation of a web collection can be for one of (or indeed, all) the following reasons:

  • To maintain institutional history
  • To document social commentary and the perspectives of users
  • To capture spontaneous events
  • To augment physical holdings
  • Responsibility: Some documents are ONLY digital. For example, if a repository upholds a role to maintain all published records, a website can be moved into the realm of publication material.

When asked about duplication amongst web archives, and whether it was a problem if two different organisations archive the same web content, Lori argued that duplication is not worrisome. More captures of a website are, in general, good for long-term preservation, and in some cases organisations can work together on collaborative collecting if the collection scope is appropriate.

Ultimately, the priority of crawling and capturing a site is to recreate the same experience a user would have if they were to visit the live site on the day it was archived. Combining this with an appropriate archive frequency means that change over time can also be preserved. This is hugely important: the ephemeral nature of internet content is widely attested. Thankfully, the misconception that ‘online content will be around forever’ is being confronted. Lori put forward some examples to illustrate why archiving websites is crucial.

In general, a typical website lasts 90-100 days before one of the following happens:

  1. The content changes
  2. The site URL moves
  3. The content disappears completely

A study was carried out on the Occupy Movement sites archived in 2012. Of 582 archived sites, only 41% were still live on the web as of April 2014. (Lori Donovan)

Furthermore, we were told about a 2014 study which concluded that 70% of scholarly articles online with text citations suffered from reference rot over time. This speaks volumes about the importance of preserving copies for the sake of both authentication and academic integrity.

The challenge continues…

Lori also pointed us to the NDSA 2016/2017 survey which outlines the principal concerns within web archiving currently: Social media (70%), Video (69%), and Interactive media and Databases (both 62%). Any dynamic content can be difficult to capture and curate, so sharing advice and guidelines amongst leaders in the web archiving community is a key factor in determining successful practice for both current web archivists and those of future generations.

Part II – Current and Future Agenda

Maria then talked us through some key tools and features which enable greater crawling technology, higher quality captures and the preservation of web archives for access and use:

  • Brozzler. Definitely my new favourite portmanteau (browser + crawler = brozzler!), brozzler is the newly developed crawler by The Internet Archive which is replacing the combination of heritrix and umbra crawlers. Brozzler captures http traffic as it is loaded, works with YouTube in order to improve media capture, and immediately writes and saves the data as a WARC file. Also, brozzler uses a real browser to fetch pages, which enables it to capture embedded URLs and extract links.
  • WARC. A Web ARChive file format is the ISO standard for web archives. It is a concatenated file written by a crawler, with long term storage and preservation specifically in mind. However, Maria pointed out to us that WARC files are not constructed to easily enable research (more on this below).
  • Elasticsearch. The full-text search system does not just search the HTML content displayed on the web pages; it also searches PDF, Word and other text-based documents.
  • solr. A metadata-only search tool. Metadata can be added on Archive-it at collection, seed and document level.
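
To make the “concatenated file” idea behind WARC more concrete, here is a minimal Python sketch that walks through uncompressed WARC-style bytes record by record. It is an illustration under simplifying assumptions, not a replacement for a proper WARC library: the helper names are hypothetical, the hand-built sample omits mandatory fields such as WARC-Record-ID and WARC-Date for brevity, and real WARC files are usually gzip-compressed, which this sketch does not handle.

```python
def make_record(warc_type, uri, payload):
    """Build one simplified WARC-style record (hypothetical helper for the demo)."""
    head = (
        "WARC/1.0\r\n"
        f"WARC-Type: {warc_type}\r\n"
        f"WARC-Target-URI: {uri}\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode("utf-8")
    # A record block is followed by two CRLFs before the next record starts.
    return head + payload + b"\r\n\r\n"


def parse_warc_records(data):
    """Yield (headers, payload) for each record in uncompressed WARC bytes."""
    pos = 0
    while pos < len(data):
        header_end = data.index(b"\r\n\r\n", pos)
        lines = data[pos:header_end].decode("utf-8").split("\r\n")
        headers = {"version": lines[0]}
        for line in lines[1:]:
            name, _, value = line.partition(":")
            headers[name.strip()] = value.strip()
        start = header_end + 4
        length = int(headers["Content-Length"])
        yield headers, data[start:start + length]
        # Skip the payload and the two CRLFs that close the record.
        pos = start + length + 4


# Two invented records concatenated into one byte string.
sample = (
    make_record("response", "http://example.org/", b"<html>Hello</html>")
    + make_record("response", "http://example.org/about", b"<html>About</html>")
)
for headers, payload in parse_warc_records(sample):
    print(headers["WARC-Target-URI"], len(payload))
```

Because each record carries its own length, a reader has to step through the file in sequence, which is part of why the format suits preservation better than ad hoc research queries.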

Supporting researchers now and in the future

The tangible experience and use of web archives, where a site can be navigated as if it were live, can shed so much light on the political and social climate of its time of capture. Yet Maria explained that the raw captured data, rather than just the replay, is obviously a rich area for potential research and, if handled correctly, is an invaluable research tool.

As well as the use of Brozzler as a new crawling technology, Archive-it research services offer a set of derivative data-set files which are less complex than WARC and allow for data analysis and research. One of these derivative data sets is a Longitudinal Graph Analysis (LGA) dataset file which will allow the researcher to analyse the trend in links between urls over time within an entire web collection.

Maria acknowledged that there are lessons to be learnt when supporting researchers using web archives, including technical proficiency training and reference resources. The range of researchers who use web archives is ever growing: social and political scientists, digital humanities disciplines, computer science, and documentary and evidence-based research including legal discovery.

What Lori and Maria both made clear throughout the workshop was that the development and growth of web archiving is integral to challenging the past and preserving access on a long term scale. I really appreciated an insight into how the life cycle of web archiving is a continual process, from creating a collection, through to research services, whilst simultaneously managing the workflow of curation.

When in Manchester…

Virtual Archive, Central Library, Manchester

I couldn’t leave Manchester without exploring the John Rylands Library and Manchester’s Central Library. In the latter, an interactive digital representation of a physical archive let visitors choose a box, arranged as it might be in a physical archive, and then projected the digitised content onto the screen. A few streets away in Deansgate I had just enough time in John Rylands to learn that the fear of beards is called Pogonophobia. Go and visit yourself to learn more!

Special collections reading room, John Rylands Library, Manchester

Study day of Ge’ez manuscripts of Ethiopia and Eritrea

Recent months have brought an unprecedented interest in Ge’ez manuscripts of Ethiopia and Eritrea – a development that we welcome at the Bodleian. Study of this material has reached a new level, with further palaeographical and codicological knowledge, as well as a growing appreciation of art history. Studying, displaying, and digitising a variety of our little-known codices and scrolls with modern means help us better understand and disseminate our findings to new audiences.

With this in mind, on Saturday, the 17th of June we welcomed a small group of Ethiopians and Eritreans at the Bodleian to view a selection of Ge’ez manuscripts of Ethiopia and Eritrea. The material, which was studied and discussed with great excitement, included a magic scroll with miniatures of angels and demons, an illuminated seventeenth-century prayer book, fragments of a medieval gospel with evangelists’ portraits, a hagiographic work with copious illustrations to the text, an important textual variant of the Book of Enoch and the epic work Kebra Nagast (Glory of the Kings).

The experience of the day was that of beautiful exchange of ideas, as well as building bridges within and between communities. We look forward to future developments!

Engaged in discussion from left to right: Dereje Debella, Judith McKenzie, Girma Getahun, Yemane Asfedai, Gillian Evison, Madeline Slaven and Rahel Fronda. Photo credit: Mai Musié.

Studying a magic scroll, from left to right: Yemane Asfedai, Girma Getahun, Dereje Debella, Madeline Slaven and Rahel Fronda. Photo credit: Gillian Evison.

Studying a textual variant of the Ethiopian Book of Enoch, from left to right: Rahel Fronda, Dereje Debella, Girma Getahun, Yemane Asfedai, Gillian Evison and Madeline Slaven. Photo credit: Miranda Williams.