In July last year my colleague Miten and I attended a DPC Briefing Day titled Email Preservation: How Hard Can It Be?, which introduced me to the work of the Task Force on Technical Approaches to Email Archives. We were lucky enough to attend the second session last week.
Arranging a second session gave Chris Prom (@chrisprom), University of Illinois at Urbana-Champaign, and Kate Murray (@fileformatology), Library of Congress, co-chairs of the Task Force, the opportunity to reflect upon the issues raised at the first session and add them to the Task Force Report. It also gave attendees an update on the Task Force's overall progress, in anticipation of the final report, scheduled to be published some time in April.
“Using Email Archives in Research”
The first guest presentation was given by Dr. James Baker (@j_w_baker), University of Sussex, who was inspired to write about the use of email archives within research by two key texts: Born-digital archives at the Wellcome Library: appraisal and sensitivity review of two hard drives (2016), an article by Victoria Sloyan, and Dust (2001), a book by Carolyn Steedman.
These texts led Dr. Baker to think of the “imagination of the archive”, as he put it: the mystique of archival research, stemming from the imagery of 19th-century research processes. He expanded on this idea, stating “physically and ontologically unique; the manuscript, is no longer what we imagine to be an archive”.
However, despite this new platform for research, Dr. Baker stated that very few people outside of archive professionals know that born-digital archives exist, let alone use them. This is an issue: archives require evidence of use, so we need to encourage use.
To address this, Dr. Baker set up a Born-Digital Access Workshop at the Wellcome Library in collaboration with their Collections Information Team, where he gathered people who use born-digital archives and the archivists who make them, and provided them with a set of four varying case studies. These four case studies were designed to explore the following:
A) the “original” environment; hard drive files in a Windows OS
B) the view experience; using the Wellcome’s Viewer
C) levels of curation; comparing reformatted and renamed collections with unaltered ones
D) the physical media; asking does the media hold value?
Several interesting observations came out of this workshop, which Dr. Baker organised into three areas:
- Levels of description; filenames are important, and are valuable data in themselves to researchers. Users need a balance between curation and an authentic representation of the original order.
- “Bog-standard” laptop as access point; using modern technology that is already used by many researchers as the mode of access to email and digital archives creates a sense of familiarity when engaging with the content.
- Getting the researcher from desk to archive; a substantial amount of work is needed to make researchers aware of the resources available to them and how to reach them: can they access collections remotely, and how much collection-level description is necessary?
Dr. Baker concluded that even with outreach and awareness events such as the one we were all attending, born-digital archives are not yet accessible to researchers. This made me realise that the digital preservation community must push for access solutions and get them out to users, so that researchers can gain the insights our digital collections have to offer.
“Email as a Corporate Record”
The third presentation of the day was given by James Lappin (@JamesLappin), Loughborough University, who discussed the issues involved in applying archival policies to emails in a governmental context.
His main point concerned the routine deletion of email that happens in governments around the world. He said there are no civil servants' email accounts scheduled to be saved past the next 3 to 4 years, but they may be available via a different structure: a kind of records management system. However, Lappin pointed out the crux of this scenario: government departments have no budget to move and save many individuals' email accounts, and no real idea of the numbers involved. How much should be saved, and how much can be saved?
“email is the record of our age” – James Lappin
Lappin suggested an alternative: keep the emails of the senior staff only. However, this raises the question of how we filter out sensitive and personal content.
Lappin posits that auto-deletion is the solution, aiming to spare institutions from unmanageable volumes of email and the consequent breaches of data protection.
- governments to kickstart email preservation action,
- the integration of tech for records management solutions,
- actively considering the value of emails for long-term preservation
But how do we transfer emails to an EDRMS, what structures do we use, how do we separate individuals, and how do we enforce the transfer of emails? These issues are yet to be worked out, and can be, Lappin argues, if we implement auto-deletion as a tool to make email preservation less daunting. At the end of the day, the current goal is to retain the “important” emails, which will make both government departments and historians happy, and this in turn makes archivists happy. This does indeed seem like a positive scenario for us all!
However, it was particularly interesting when Lappin made his next point: what if the very nature of email, as intimate and immediate, makes governments uncomfortable with the idea of saving and preserving governmental correspondence? Governments must therefore be more active in their selection processes and save something, rather than nothing, which is where the implementation of auto-deletion could, again, prove useful!
To conclude, Lappin presented a list of characteristics which could justify the preservation of an individual's government email account, which included:
- The role they play is of historic interest
- They expect their account to be permanently preserved
- They are given the chance to flag or remove personal correspondence
- Access to personal correspondence is prevented except in case of overriding legal need
Personally, I feel this is fair and thorough, but only time will tell what route various governments take.
On a side note: Lappin runs an excellent comic-based blog on Records Management which you can see here.
One of the key issues that stood out for me today was, maybe surprisingly, not to do with the technology used in email preservation, but with how to address the myriad issues email preservation brings to light: namely the feasibility of data protection, sensitivity review and appraisal, which are particularly pressing when dealing with such vast quantities of material.
Email can only be preserved once we have defined what constitutes ‘email’ and how to proceed ethically, morally and legally. Then we can move forward with implementing technical frameworks designed to meet those pre-defined requirements. These will enable access to historically valuable, information-rich email archives that will yield much in the name of research.
In the tweet below, Evil Archivist succinctly reminds us of the importance of maintaining and managing our digital records…
— Evil Archivist (@EvilArchivist) October 11, 2017