All posts by benjaminpeirsonsmith

Oxford LibGuides: Web Archives

Web archives are becoming more and more prevalent and are being increasingly used for research purposes. They are fundamental to the preservation of our cultural heritage in the interconnected digital age. With the continuing collection development on the Bodleian Libraries Web Archive and the recent launch of the new UK Web Archive site, the web archiving team at the Bodleian have produced a new guide to web archives. The new Web Archives LibGuide includes useful information for anyone wanting to learn more about web archives.

It focuses on the following areas:

  • The Internet Archive, The UK Web Archive and the Bodleian Libraries Web Archive.
  • Other web archives.
  • Web archive use cases.
  • Web archive citation information.

Check out the new look for the Web Archives LibGuide.

 

 

IHR History Day 2018: Exploring our collections

Ahead of History Day 2018, at which the Bodleian will be in attendance, we thought we would explore the Bodleian Libraries Special Collections by asking our colleagues which items in the Special Collections are their favourites and why. The Bodleian Libraries’ Special Collections (at the Weston Library) hold the second largest collection of manuscripts and archives in Britain, spanning a range of subject areas.

For a full breakdown of subjects see the Weston Library’s subject guides.

The following responses showcase the variety and depth of the materials held at the Bodleian, and also provide a unique insight into those who work with the collections on a day-to-day basis.

Catherine McIlwaine, Archivist: Hodgkin’s Nobel Prize

Dorothy Hodgkin was an extraordinary chemist and X-ray crystallographer. During her long career at Oxford, as a Fellow at Somerville College, she determined the structure of penicillin, vitamin B12 and insulin. In 1964 she was awarded the Nobel Prize for Chemistry. She received the news when she was in Ghana visiting her husband, Thomas. They sent a telegram to their daughter, Liz, who was teaching at a school in Zambia, and to keep costs down, they sent the shortest possible message: ‘Dorothy nobel chemistry’! She is still the only British female scientist to have received a Nobel Prize.

                                       Telegram sent to Dorothy Hodgkin, 1964

The Bodleian holds an extensive archive relating to her life and career and this telegram is part of an additional donation of family papers made in 2014 by her daughter Liz.
MS. Eng. c. 8262, fol. 134

Charlotte McKillop-Mash, Project archivist: Admiral Lord John Fisher’s letter

This is an August 1912 letter from the 71-year-old former first sea lord Admiral Lord John Fisher to Francis Hopwood, a senior civil servant serving as an additional civil lord of the Admiralty.

Fisher retired from the Admiralty in 1910 but kept himself busy by chairing a royal commission on fuel oil, strongly advocating for Britain to build more submarines, and firing off effusive and opinionated letters about rearmament. An unorthodox and radical reformer during his time in the Admiralty, Fisher was a ferociously energetic and outspoken man who, unsurprisingly, alienated plenty of his colleagues along the way.

Fisher was often outspoken in his opinions, which on occasion caused alarm. Perhaps the most notorious example was his recommendation to Edward VII in 1904 that the British should ‘Copenhagen’ the German fleet—that is, emulate the example of Nelson and attack the German fleet in Kiel before it grew too powerful [DNB]

In 1914 Winston Churchill, then first lord of the Admiralty, re-appointed Fisher as first sea lord.

The letter reads:

Eng. c. 7351/4, fol. 48, recto

Dear Hopwood – Lane’s letter splendid! I’ve written to him. All our experts want shoving over the precipice! I heard one d-d fool the other day say “Well, thank God! We’ve not killed 15 men like the Nuremberg firm in experiments with internal combustion engines!” We ought to be heartily ashamed that we have not killed any one!

What we want is a real bloody war to re-invigorate us!

The real serious thing is that the Germans will have 14 vessels at sea with the Internal Combustion Engines before we have one, thus gaining inestimable experience. We are awfully behind!

Eng. c. 7351/4, fol. 48, verso

And we are going deliberately to order steam oil-tankers instead of their being one & all fitted with oil engines! It’s damnable! All to save a little money – no other reason! All our experts expect the Internal Combustion Engine to be perfection before adoption. They strain at the gnat of perfection and swallow the camel of un-readiness! They expect the 100,000 Horse power Internal Combustion Engines of the 32 knot armoured cruiser “Non-Pareil” to emerge perfect like Minerva out of the head of Jupiter!

Yours till charcoal sprouts!

Fisher 24.8.12

The letter can be found in the Papers of Francis John Stephens Hopwood, Baron Southborough, 1737-1945. Shelfmark: Eng. c. 7351/4, fol. 48

 

Jeremy McIlwaine, Senior Archivist: Conservative Party’s Official Christmas Card, 1938

My favourite item, from the Conservative Party Archive, is the Conservative Party’s official Christmas card from 1938, when the Party was riding high in the polls on the back of Chamberlain’s success in preventing war over Czechoslovakia in September 1938. It features a facsimile of the supplementary agreement reached between Chamberlain and Hitler at Munich on 30th September, 1938.

Conservative Party’s Official Christmas Card, 1938 featuring Neville Chamberlain and Adolf Hitler 

I like this item because it never fails to shock students when I show it to them during inductions on how to use archival material – not only because the idea of a British prime minister shaking hands and smiling with Hitler, who is wearing a swastika armband, seems so appalling, but because the Conservative Party chose to feature such a document within a Christmas card. But it also shows the danger of allowing hindsight to influence our interpretation of history. Chamberlain’s reputation has suffered ever since Munich because, as a policy, appeasement ultimately failed to prevent the Second World War. But at the time, there was widespread support in the country for avoiding war at all costs, even to the point of allowing Czechoslovakia to be sacrificed. On his return, Chamberlain received thousands of gifts from a grateful public, including a silver dinner service. Consideration was even given to calling a general election in order to reap the benefit of Chamberlain’s popularity, and the Conservatives would undoubtedly have won a landslide.

Historians will continue to debate appeasement and Chamberlain’s role in it, and whether, perhaps, he was right to pursue it, merely to buy Britain enough time to re-arm and prepare for war. The Christmas card is such a small document, but it represents so much controversy, not to mention a blatant example of a political party attempting to make political capital out of a crisis situation.

The card can be found in the Weston Library’s Conservative Party Archive, Shelfmark: CRD/D/3/1/3

Stuart Ackland, Bodleian Map Room: Clark’s Chart of the World

Population maps such as this are, in the field of cartography, a relatively recent product, with the first known examples being published in the early 1800s. Early maps would give tables showing population figures; this example from 1822 has one that is Christian only, and it colours parts of the world depending on the ‘Degrees of Civilization’.

Clark’s Chart of the World, 2nd ed, 1822. (E) B1 (151)

On the map’s colour-coded key, the degrees of civilisation range from ‘Savage’ through ‘Barbarians’, ‘Half Civilised’ and ‘Civilized’ to ‘Enlightened’. In its religion section, the map only covers Christianity (and its various denominations) and ‘Mahomedan’, with all others being listed as ‘Pagan’. The map provides a fascinating insight into British society’s views on the wider world in the 1800s.

Inset of Clark’s Map showing population table and legend

For other posts about the Bodleian’s maps see The Bodleian Map Room blog.

Matthew Neely, Senior Archivist: The two Presidents

My favourite item is an entry for 11 June 1961 from Macmillan’s diary. In June 1961, John F. Kennedy arrived in London on his first visit as US President. Although Macmillan was somewhat apprehensive of the youthful new President, the two soon forged a close relationship. Macmillan took the opportunity to record in his diary a comparison of his relationships with Eisenhower and Kennedy, contrasting the instinctive style of Eisenhower with the intellectually inquisitive approach of Kennedy.

Cecil Stoughton. White House Photographs. John F. Kennedy Presidential Library and Museum, Boston

The entry can be found in the Catalogue of the papers of Harold Macmillan, 1889-1987 Shelfmark: Macmillan dep. d. 42, fol. 69

We look forward to seeing everyone at the Bodleian’s stand for History Day 2018  at Senate House, London on the 27th of March 2018 where we will be sharing information about our collections. Make sure to come and say hello.

Co-edited by Carl Cooper and Ben Peirson-Smith.

The UK Web Archive: Online Enthusiast Communities in the UK

The beginnings of the Online Enthusiast collection of the UK Web Archive can be traced back to November 2016 and a task to scope out the viability and write a proposal for two potential special collections with a focus on current web use: Mental Health, and Online Enthusiasts.

The Online Enthusiasts special collection was intended to show how people within the UK are using the internet to aid them in practising their hobbies, for example discussing their collections of objects or coordinating their bus spotting. If it was something a person could enthuse about and it was on the internet within the UK then it was in scope. Where many UK Web Archive Special Collections are centred on a specific event and online reactions, this was more an attempt to represent the way in which people are using the internet on an everyday basis.

The first step toward a proposal was to assess the viability of the collection, and this meant searching out any potential online enthusiast sites to judge whether this collection would have enough content hosted within the UK to validate its existence. As it turns out, UK hobbyists are very active in their online communities and finding enough content was, if anything, the opposite of an issue. Difficulty came with trying to accurately represent the sheer scope of content available – it’s difficult to google something that you weren’t aware existed 5 minutes ago. After an afternoon among the forums and blogs of ferry spotters, stamp collectors, homebrewers, yarn-bombers, coffee enthusiasts and postbox seekers, there was enough proof of content to complete the initial proposal stating that a collection displaying the myriad uses hobbyists in the UK have for the internet is not only viable but also worthwhile. Eventually that proposal was accepted and the Online Enthusiast collection was born.

The UKWA Online Enthusiast Communities in the UK collection provides a unique cultural insight into how communities interact in digital spheres. It shows that, through the internet, people with similar niche hobbies and interests can connect, share and enthuse about them. Many of these communities grow and shrink rapidly, so many years of content can be lost if a website is no longer hosted.

With the amount of content on the internet, finding websites had a domino effect, where one site would link to another site for a similar enthusiast community, or we would find lists including hobbies we’d never even considered before. This meant that before long we had a wealth of content that we realised would need categorising. Our main approach to categorising the content was along thematic lines. After identifying what we were dealing with, we created a number of sub-collections, examples of which include animal-related hobbies, collecting-focused hobbies, observation hobbies, and sports.

The approach to selecting content for the collection was mainly focused on identifying UK-centric hobbies and using various search terms to identify active communities. The majority of these communities were forums. These forums provided enthusiasts with a platform to discuss various topics related to their hobbies whilst also giving them the opportunity to share other forms of media such as video, audio and photographic content. Other platforms such as blogs and other websites were also collected. The blogs often relied on readers submitting content to the blog owner, who would then filter it and post related content to the community.

As of May 2018 the collection has over 300 archived websites. We found that the best-populated categories were sports, collecting and animal-related hobbies.

A few examples of websites related to hobbies that were new to us include:

  • UK Pigeon Racing Forum: An online enthusiast forum concerned with pigeon racing.
  • Fighting Robots Association Forum: An online enthusiast forum for those involved with the creation of fighting robots.
  • Wetherspoon’s Carpets (Tumblr): A Tumblr blog concerned with taking photographs of the unique carpets inside the Wetherspoon’s chain of pubs across the UK.
  • Mine Exploration and History Forum: An online enthusiast community concerned with mine exploration in the UK.
  • Chinese Scooter Club Forum: An online enthusiast community concerned with all things related to Chinese scooters.
  • Knit The City (now Whodunnknit): A website belonging to a graffiti-knitter/yarnbomber from the UK.

The Online Enthusiast Communities in the UK collection is accessible via the UK Web Archive’s new beta interface.

PASIG 2017: Smartphones within the changing landscape of digital preservation

I recently volunteered at the PASIG 2017 Conference in Oxford, which was a great opportunity to learn more about the archives sector. Many of the talks at the conference focused on the current trends and influences affecting the trajectory of the industry.

One presentation that covered some of these trends in detail was given by Somaya Langley from Cambridge University Library (Polonsky Digital Preservation Project) in the ‘Future of DP theory and practice’ session: ‘Realistic digital preservation in the near future: How do we get from A to Z when B already seems too far away?’. Somaya’s presentation considered how we preserve the digital content we receive from donors on smartphones, with a focus on iOS.

Langley, Somaya (2017): Realistic digital preservation in the near future: How to get from A to Z when B seems too far away?. figshare. https://doi.org/10.6084/m9.figshare.5418685.v1 Retrieved: 08:22, Sep 22, 2017 (GMT)

Somaya’s presentation discussed how, in the field of digital preservation, ingest suites have long been used to dealing with CDs, DVDs, floppy disks and HDDs. However, they are not sufficiently prepared for ingesting smartphones or tablets, or for the various issues associated with these devices. We must realise that smartphones potentially hold a wealth of information for archives:

‘With the design of the Apple Operation System (iOS) and the large amount of storage space available, records of emails, text messages, browsing history, chat, map searching, and more are all being kept’.

(Forensic Analysis on iOS Devices,  Tim Proffitt, 2012. https://uk.sans.org/reading-room/whitepapers/forensics/forensic-analysis-ios-devices-34092 )

Why iOS? What about Android?

In the UK, unlike the rest of Europe, the smartphone market is closely split: in November 2016 iOS accounted for 48.3% of sales against 49.6% for Android. This contrasts with Apple’s global market share of 12.1% in Q3 of 2016.

Whichever side of the fence you stand on, it is clear that smartphones, be they Android or iOS, will play an important role in our collections. The skills required to extract content differ across platforms, so we as digital archivists will have to learn both methods of extraction and leave our consumer preferences at the door.

So how do we get the data off the iPhone?

iOS has long been known as a ‘locked-down’ operating system, and Apple have always had an anti-tinkering stance with many of their products. Therefore it should come as no surprise that locating files on an iPhone is not very straightforward.

As Somaya pointed out in her talk, after spending six hours in the Apple Shop ‘Genius Bar’ she was no closer to understanding from Apple employees what the best course of action would be to locate backups of notes from a ‘bricked’ iPhone. Therefore she used her own method of retrieving the notes, using iExplorer to search through the backups from the iPhone.

She noted, however, that the limitations of iOS made these files very challenging to locate. In some cases the command line was needed to reach the backup storage location, which is hidden by default in OS X (now macOS, the main operating system used on Apple computers).
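As a rough illustration of what "hidden by default" means in practice, the sketch below lists the device backups found in a backup folder. It is only a sketch under stated assumptions, not the method used in the talk: the default path (the user's Library/Application Support/MobileSync/Backup folder on macOS) and the Info.plist key names are assumptions about how iTunes-style backups are laid out, and the function name list_backups is our own.

```python
import plistlib
from pathlib import Path
from typing import Optional


def list_backups(backup_root: Optional[Path] = None) -> list:
    """Return basic details for each device backup found under backup_root."""
    if backup_root is None:
        # Assumed default location of iTunes-style backups on macOS.
        backup_root = (Path.home() / "Library" / "Application Support"
                       / "MobileSync" / "Backup")
    backups = []
    if not backup_root.is_dir():
        return backups
    for folder in sorted(backup_root.iterdir()):
        info_file = folder / "Info.plist"
        if not info_file.is_file():
            continue
        with info_file.open("rb") as handle:
            info = plistlib.load(handle)
        backups.append({
            "folder": folder.name,
            # Key names below are assumptions; .get() returns None if absent.
            "device_name": info.get("Device Name"),
            "ios_version": info.get("Product Version"),
            "last_backup": info.get("Last Backup Date"),
        })
    return backups


if __name__ == "__main__":
    for backup in list_backups():
        print(backup)
```

Reading only the descriptive Info.plist leaves the backup itself untouched, which matters when the device or backup is a donor's original.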

Many tools exist for extracting information from iPhones. The SANS Institute white paper on Forensic Analysis on iOS Devices by Tim Proffitt outlines four main methods:

  1. Acquisition via iTunes Backups (requires original PC last used to sync the iPhone)
  2. Acquiring Backup Data with iPhone Analyzer (free Java-based computer program; issues exist when dealing with encrypted backups)
  3. Acquisition via Logical Methods (uses a synchronisation method built into iOS to recover data, e.g. programs like iPhone Explorer)
  4. Acquisition via Physical Methods (obtaining a bit-by-bit copy, e.g. Lantern 2 forensics suite)

Encryption is a challenge when retrieving data from the iPhone, especially since iTunes includes a feature for encrypting backups when syncing. Proffitt suggests using a password cracker or jailbreaking as solutions to this issue; however, these solutions might not be appropriate in an archival setting.

Another issue with smartphone digital preservation is platform and version locking. Although the above methods work for data extraction at the moment, future versions of iOS could well make them defunct, requiring software developers to keep updating their programs or look for new approaches.


Final thoughts

One final consideration that can be raised from Somaya’s talk is that of privacy. As with the arrival of computers into our archives, phones will pose similar moral questions for archivists:

Do we ascribe different values to information stored on smartphones?
Do we consider the material stored on phones more personal than data stored on our computers?

As mentioned previously, our phones store everything from emails and geo-tagged photos to phone call information and now, with the growing popularity of smart wearable technology, health data (including the user’s heart rate, daily activity, weight etc.). We as digital archivists will be dealing with very sensitive personal information and need to be prepared to take on the responsibility of safeguarding it appropriately.

There is no doubt that soon enough we in the archive field will be receiving more and more smartphones and tablets into our archives from donors. Hopefully talks like Somaya’s will start the ball rolling towards the creation of better standards and approaches to smartphone digital curation.

‘Getting Started with Digital Preservation’ Workshop

On the 17th of May I attended the Digital Preservation Coalition’s (DPC) ‘Getting Started with Digital Preservation’ workshop in London.

The one-day event was a great opportunity to gain clear insights into starting in the digital preservation sector, and provided a useful platform for networking with other archivists. The event consisted of lectures from DPC members on various topics related to starting digital preservation. It also included group exercises that were aimed at putting these ideas into practice.

The day started with a brief overview of digital preservation. The DPC team began by asking us to identify the main aspects of traditional archival preservation for physical documents: a document’s physical, robust and tangible nature; its ability to be independently understandable without relying on technology; the existence of well-established approaches to its preservation; and the existence of a well-established understanding of value-assessment relating to these documents.

This was used as a springboard to introduce us to many issues that we would face in transitioning to digital: the ephemeral and intangible nature of digital (1s and 0s can’t be held in your hands); the need for technology and software for documents to be understood (e.g. a PDF file requires software to open it); obsolescence (e.g. new hardware and software making older files unreadable); and the lack of any value-assessment experience in the field (how do we assess the value of a set of data?).

These areas helped us to understand that digital preservation presents its own set of unique challenges that have to be understood within their own context. The question of ‘Why Digitise?’ was then put to the attendees at the workshop. The responses focused on legal, research, cultural heritage, funding, efficiency, contingency and access reasons for digitising. This shows us that digital preservation cannot be seen as a simple solution to a single problem but a complex solution to many.

Bit-level preservation was covered in detail at the workshop. This section focused on the potential dangers that could affect data and how to guard against them. The three main areas were media obsolescence, where a media type is no longer used or the hardware no longer exists to support it; media failure or decay, where the media itself reaches the end of its life cycle or breaks; and natural or human-made disaster, such as fire or earthquake. These dangers are mitigated by keeping multiple backup copies of the data, typically two to three or more (the actual number of copies needed is a subject of debate), storing these copies in different geographical locations, and periodically migrating the data to new storage devices.
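One argument sometimes made for keeping three copies rather than two is that, when a copy goes bad, the remaining copies can outvote it. The sketch below illustrates that idea only; it is our own illustration and was not presented at the workshop. The function names are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from collections import Counter
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a whole file (fine for a sketch; chunk large files)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def audit_copies(copies: list) -> tuple:
    """Split copies into (agreeing, suspect) by majority vote on digests.

    Raises ValueError when no digest is shared by more than half of the
    copies, because then we cannot tell which copy is the intact one.
    """
    digests = {copy: file_digest(copy) for copy in copies}
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(copies):
        raise ValueError("no majority: cannot tell which copy is intact")
    agreeing = [c for c in copies if digests[c] == majority_digest]
    suspect = [c for c in copies if digests[c] != majority_digest]
    return agreeing, suspect
```

With only two copies that disagree, the function raises an error, since neither copy can be trusted over the other. With three, a single damaged copy is flagged as suspect and can be replaced from one of the agreeing copies.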

The workshop also looked at integrity checks and the role they play in bit-level preservation. Integrity checking is the process of creating a ‘checksum’ or ‘hash value’ (a number generated by running an integrity checking program such as Fixity or ACE on a file; the COPTR tool registry lists others). This number is effectively unique to that data, like a fingerprint, and can be used to check whether the data has changed or become corrupted in any way due to bit-rot or other data corruption.

Fixity: https://www.avpreserve.com/tools/fixity/
ACE: https://wiki.umiacs.umd.edu/adapt/index.php/Ace
COPTR: http://coptr.digipres.org/Category.Fixity
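To make the fingerprint idea concrete, here is a minimal sketch of an integrity check using Python's standard hashlib module. It is not how Fixity or ACE are implemented; SHA-256 is an assumed choice of hash, and the function names are our own.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_digest: str) -> bool:
    """Compare a fresh digest against the one recorded at ingest."""
    return sha256_of(path) == recorded_digest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        sample = Path(folder) / "letter.txt"
        sample.write_bytes(b"Dear Hopwood")
        recorded = sha256_of(sample)            # stored alongside the file
        print(is_unchanged(sample, recorded))   # True: nothing has changed
        sample.write_bytes(b"Dear Hopwoad")     # simulate a corrupted byte
        print(is_unchanged(sample, recorded))   # False: digest no longer matches
```

The digest is recorded once, at ingest, and recomputed on a schedule. A mismatch does not say what went wrong, only that the stored bytes are no longer the ones that were deposited.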

Later in the workshop characterisation tools were demonstrated. The tool showcased was DROID (Digital Record Object Identification). DROID is an open-source tool that analyses the files on a system and identifies their formats using signatures drawn from PRONOM, a registry of file formats. The presentations stressed that the databases these tools rely on are important and need regular updating to remain accurate. Other characterisation tools mentioned were C3PO, JHOVE, Tika and FITS.

PRONOM: http://www.nationalarchives.gov.uk/PRONOM/Default.aspx
DROID: https://sourceforge.net/projects/droid/
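Signature-based identification can be pictured with a much simplified sketch: look at the first few bytes of a file and compare them against a table of known patterns. The table below is hand-written for illustration and is not taken from PRONOM, and the function name identify is our own. Real tools use far richer signatures than a fixed prefix.

```python
# Hand-written, illustrative signature table: (leading bytes, label).
# This is NOT PRONOM data, only a handful of well-known file prefixes.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify(data: bytes) -> str:
    """Return a label for the first signature that matches the data's prefix."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unidentified"


if __name__ == "__main__":
    print(identify(b"%PDF-1.4\n..."))   # PDF document
    print(identify(b"plain text"))      # unidentified
```

The sketch also shows why the signature database matters: a format that is missing from the table simply comes back as unidentified, however common it may be.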

The presentation on departmental readiness provided useful insights into preparing for digital preservation projects. It focused on the way that maturity models can be used to benchmark your department’s readiness for digital preservation. The two main models discussed were the Digital Preservation Capability Maturity Model and the NDSA Levels of Digital Preservation. These models aim to identify gaps in an institution’s readiness for digital preservation, whilst also highlighting aspects of best practice that it can aim to achieve.

DPCMM: http://www.securelyrooted.com/dpcmm
NDSA: http://www.digitalpreservation.gov/documents/NDSA_Levels_Archiving_2013.pdf

A risk assessment exercise also formed part of the workshop. Those attending were asked to consider how various risks would affect the digital archival process. The risks would then be ranked on their likelihood of occurring, and the potential damage that they might cause. We would then propose potential solutions to help mitigate these risks, and prevent further ‘explosive’ risks from occurring. This was followed by assessing whether the scores for both criteria had improved.
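The scoring part of the exercise can be sketched in a few lines. This is an illustration only: the 1 to 5 scales, the choice to multiply likelihood by impact, and all of the numbers are our own assumptions rather than the DPC's scheme.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (minor) to 5 (catastrophic)

    @property
    def score(self) -> int:
        # One common convention: score = likelihood x impact.
        return self.likelihood * self.impact


def rank(risks: list) -> list:
    """Return risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)


if __name__ == "__main__":
    # Illustrative figures only, before and after a proposed mitigation.
    before = [
        Risk("Media failure", likelihood=4, impact=5),
        Risk("Media obsolescence", likelihood=3, impact=4),
        Risk("Fire at storage site", likelihood=1, impact=5),
    ]
    after = [
        Risk("Media failure", likelihood=4, impact=2),   # extra copies kept
        Risk("Media obsolescence", likelihood=2, impact=4),
        Risk("Fire at storage site", likelihood=1, impact=2),
    ]
    for old, new in zip(before, after):
        print(f"{old.name}: {old.score} -> {new.score}")
    print("Top risk after mitigation:", rank(after)[0].name)
```

Re-scoring after mitigation, as we did in the exercise, shows whether a proposed solution reduces likelihood, impact, or both, and which risk now deserves attention first.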

The last presentation was on digital asset registers. It focused on the importance of creating and managing a detailed spreadsheet to record an institution’s digital assets, with the aim of having one organised and accessible source of information on a digital collection. The presentation focused on how this register could be shared with all members of staff to promote a better understanding of a digital collection. It mentioned that this would remove the issue of having one staff member who was a sole specialist on a collection, and promote further transparency throughout the digital preservation process. Another idea mentioned was that the register could be used to make the case for further funding for digital collections, by providing a visual representation of the digital preservation process.
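A first draft of such a register can be generated rather than typed by hand. The sketch below walks a collection folder and produces CSV text that opens in any spreadsheet program. The column names and the function names are hypothetical choices of ours, not a DPC template, and a real register would add descriptive columns (owner, rights, retention) that no script can fill in.

```python
import csv
import hashlib
import io
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical column set for the register.
FIELDS = ["relative_path", "size_bytes", "sha256", "last_modified_utc"]


def build_register(collection_root: Path) -> list:
    """Return one row (a dict) per file found under collection_root."""
    rows = []
    for path in sorted(collection_root.rglob("*")):
        if not path.is_file():
            continue
        stat = path.stat()
        modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
        rows.append({
            "relative_path": path.relative_to(collection_root).as_posix(),
            "size_bytes": stat.st_size,
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            "last_modified_utc": modified.isoformat(timespec="seconds"),
        })
    return rows


def register_as_csv(rows: list) -> str:
    """Serialise register rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling register_as_csv(build_register(Path("my_collection"))) returns text that can be saved as a .csv file and shared with colleagues, where my_collection stands in for whatever folder holds the collection.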

I thoroughly enjoyed the DPC workshop and look forward to attending similar workshops.