Last week I was lucky enough to attend the PASIG 2017 (Preservation and Archiving Special Interest Group) conference, held at the Oxford University Museum of Natural History, where over the course of three days the digital preservation community connected to share experiences, tools, successes and mishaps.
The story of one such mishap came from Eduardo del Valle, Head of the Digitization and Open Access Unit at the University of the Balearic Islands (UIB), in his presentation titled “Sharing my loss to protect your data: A story of unexpected data loss and how to do real preservation”. In 2013 the digitisation and digital preservation workflow pictured below was set up by the IT team at UIB.
Del Valle was told this was a reliable system with fast retrieval. In practice, retrieval was slow, and the only means of organisation was an Excel spreadsheet recording the storage locations of the data.
In order to assess their situation they used the NDSA Levels of Digital Preservation, a tiered set of recommendations on how organisations should build their digital preservation activities, developed by the National Digital Stewardship Alliance (NDSA) in 2012. The guidelines are organised into five functional areas that lie at the centre of digital preservation:
- Storage and geographic location
- File fixity and data integrity
- Information security
- Metadata
- File formats
These five areas then have four columns (Levels 1-4) which set tiered recommendations of action, from Level 1 being the least an organisation should do, to Level 4 being the most an organisation can do. You can read the original paper on the NDSA Levels here.
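To make the shape of the grid concrete, here is a minimal Python sketch of how an assessment against the Levels might be recorded and queried. It is purely illustrative: the `gaps` function, the idea of storing a single "level reached" per area, and the scores in the usage note are all assumptions made for this example, not UIB's actual results or an official NDSA tool.

```python
from typing import Dict, List

# The five functional areas of the NDSA Levels of Digital Preservation.
AREAS = [
    "Storage and geographic location",
    "File fixity and data integrity",
    "Information security",
    "Metadata",
    "File formats",
]
MAX_LEVEL = 4


def gaps(assessment: Dict[str, int], target: int = MAX_LEVEL) -> Dict[str, List[int]]:
    """List the levels still to be reached in each area, up to the target.

    Simplifying assumption: each area has one 'level reached' score,
    where 0 means Level 1 has not yet been met.
    """
    result: Dict[str, List[int]] = {}
    for area in AREAS:
        reached = assessment.get(area, 0)
        if not 0 <= reached <= MAX_LEVEL:
            raise ValueError(f"Level for {area!r} must be between 0 and {MAX_LEVEL}")
        outstanding = list(range(reached + 1, target + 1))
        if outstanding:
            result[area] = outstanding
    return result
```

For example, with hypothetical scores of 2 for storage, 0 for fixity, 4 for information security, 1 for metadata and 3 for file formats, `gaps` would report Levels 3 and 4 outstanding for storage, all four levels for fixity, Levels 2 to 4 for metadata, Level 4 for file formats, and nothing for information security.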
The slide below shows the extent to which the University met the NDSA Levels. They found there was an urgent need for improvement.
“Anything that can go wrong, will go wrong” – Eduardo del Valle
In 2014 the IT team decided to implement a new backup system. While the installation and configuration of the new backup system (B) was completed, the old system (A) remained operative.
On the 14th and 15th November 2014, the digital material generated during the digitisation of 9 rare books from the 14th century was backed up to the Tape Backup System (A). Notably, two confirmation emails were received, verifying the success of the backup. By October 2015, all digital data had been migrated from System (A) to the new System (B), spanning UIB projects from 2008-2014.
However, on 4th November 2015, a loss of data was detected…
The files corresponding to the 9 digitised rare books were lost. This loss was detected a year after the initial backup of the 9 books in System A, by which time the contract for technical assistance had finished. This meant there was no possibility of obtaining financial compensation, even if the loss was due to a hardware or software problem. The loss of these files, unofficially dubbed “the X-files”, meant the loss of three months of work and its corresponding economic cost. Furthermore, the rare books were in poor condition, and digitising them again could cause serious damage. Despite a number of theories, the University is yet to receive an explanation for the loss of data.
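The story shows that a confirmation email is not the same as evidence that the files are intact. As a rough illustration of the kind of fixity check the NDSA Levels recommend, the Python sketch below records a SHA-256 checksum for every file at the time of digitisation and later compares a backup copy against that record. This is not a description of what UIB or any particular product does; the function names and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_copy(manifest: Dict[str, str], copy_root: Path) -> Dict[str, List[str]]:
    """Compare a stored manifest against a copy and report problems."""
    report: Dict[str, List[str]] = {"missing": [], "altered": []}
    for rel_path, expected in manifest.items():
        candidate = copy_root / rel_path
        if not candidate.is_file():
            report["missing"].append(rel_path)
        elif sha256_of(candidate) != expected:
            report["altered"].append(rel_path)
    return report
```

In use, `build_manifest` would be run once on the master files and its output stored separately from them. Running `verify_copy` against each backup copy at regular intervals, and again after any migration between systems, would flag missing or altered files soon after the problem occurred rather than a year later.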
To combat issues like this, and to enforce best practice in their digital preservation efforts, the University acquired Libsafe, a digital preservation solution offered by Libnova. Libsafe is OAIS and ISO 14721:2012 compliant, and encompasses advanced metadata management with a built-in ISAD(G) filter, with the possibility to import any custom metadata schema. Furthermore, Libsafe offers fast delivery, format control, storage of two copies in disparate locations, and a built-in catalogue. With the implementation of a standards-compliant workflow, the UIB proceeded to meet all four levels of the five areas of the NDSA Levels of Digital Preservation.
ISO 14721:2012, Space Data and Information Transfer Systems – Open Archival Information System (OAIS) – Reference Model, provides a framework for implementing the archival concepts needed for long-term digital preservation and access, for describing and comparing the architectures and operations of existing and future archives, and for describing the roles, processes and methods of long-term preservation.
The use of these standards facilitates easy access to, and discovery and sharing of, digital material, as well as its long-term preservation. Del Valle’s story of data loss reminds us of the importance of implementing standards-based practices in our own institutions, to minimise risk and maximise interoperability and access, in order to undertake true digital preservation.
With thanks to Eduardo del Valle, University of the Balearic Islands.