Category Archives: Digital archives

#WAWeek2017 – Researchers, practitioners and their use of the archived web

This year, the world of web archiving saw a premiere: not only were the biennial RESAW conference and the IIPC conference, established in 2016, held jointly for the first time, but they also formed part of a whole week of workshops, talks and public events around web archives – Web Archiving Week 2017 (or #WAWeek2017 for the social medially inclined).

After previous conferences in Reykjavik (2016) and Aarhus (RESAW 2015), the big 2017 event was held in London, 14-16 June 2017, organised jointly by the School of Advanced Study of the University of London, the IIPC and the British Library.
The programme was packed full of an eclectic variety of presentations and discussions, with topics ranging from the theory and practice of curating web archive collections or capturing whole national web domains, via technical topics such as preservation strategies, software architecture and data management, to the development of methodologies and tools for web archive-based research and case studies of their application.

Even in digital times, who doesn’t like a conference pack? Of course, the full programme is also available online. (…but which version will be easier to archive?)


Researchers, practitioners and their use of the archived web. IIPC Web Archiving Conference 15th June 2017

From the 14th – 16th of June researchers and practitioners from a global community came together for a series of talks, presentations and workshops on the subject of web archiving at the IIPC Web Archiving Conference. This event coincided with Web Archiving Week 2017, a week-long event running from 12th – 16th June hosted by the British Library and the School of Advanced Study.

I was lucky enough to attend the conference on the 15th June with a fellow trainee digital archivist and listen to some thoughtful, engaging and challenging talks.

The day started with a plenary in which John Sheridan, Digital Director of the National Archives, spoke about the work of the National Archives and the challenges of, and approaches to, web archiving that they have taken. The National Archives is principally the archive of the government: it allows us to see what the state saw through the state’s eyes. Archiving government websites is a crucial part of this record keeping as we move further into the digital age, where records are increasingly born-digital. A number of points were made which highlighted the motivations behind web archiving at the National Archives.

  • They care about the records that government are publishing and their primary function is to preserve the records
  • Accountability for government services online or information they publish
  • Capturing both the context and content

By preserving what the government publishes online, it can be held accountable. Accountability is one aspect that demonstrates the inherent value of archiving the web. You can find a great blog post on accountability and digital services by Richard Pope at this link: http://blog.memespring.co.uk/2016/11/23/oscon-2016/

The published records and content on the internet provide valuable and crucial context for the records that are unpublished, linking the backstory and the published records. This allows for a greater understanding and analysis of the information and will be vital for researchers and historians now and into the future.

Quality assurance is a high priority at the National Archives. A narrow crawling focus has both allowed and prompted a lot of effort to be directed into the quality of the archived material, so that it has high fidelity in playback. To maintain these high standards, a really good in-depth crawl can take weeks. Having a small, curated collection is an incentive to work harder on capture.

The users and their needs were also discussed as this often shapes the way the data is collected, packaged and delivered.

  • Users want to substantiate a point. They use the archived sites for citation on Facebook or Twitter for example
  • The need to cite for a writer or researcher
  • Legal – What was the government stance or law at the time of my client’s case?
  • Researchers’ needs – This was highlighted as an area where improvements can be made
  • Government itself are using the archives for information purposes
  • Government websites requesting crawls before their website closes – An example of this is the NHS website transferring to a GOV.UK site

The last part of the talk focused on the future of web archiving and how this might take shape at the National Archives. Web archiving is complex and at times chaotic. Traditional archiving standards have been placed upon it in an attempt to order the records. It was a natural evolution for information managers and archivists to use their existing knowledge, skills and standards to bring this information under control. This has resulted in difficulties in searching across web archives, describing the content and structuring the information. The nature of the internet and the way in which the information is created mean that uncertainty inevitably has to be embraced.

Digital archiving could take the turn into 2.0, the second generation, moving away from the traditional standards and embracing new standards and concepts. One proposed method is the ICA Records in Contexts conceptual model. It proposes a multidimensional description, with each ‘thing’ having a unique description, as opposed to the traditional unit of description (one size fits all). Instead of a single hierarchical, fonds-down approach, the Records in Contexts model uses a description that can be formed as a network or graph. The context of the fonds is broader, linking between other collections and records to give different perspectives and views. The records can be enriched in this way to provide a fuller picture of the record or archive. The web produces content that is in a constant state of flux, and a system of description that can grow and morph over time, creating new links and context, would be a fruitful addition.
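
To make the contrast concrete, here is a minimal Python sketch of the two shapes of description. It is only an illustration: the entity names, relation labels and helper functions are hypothetical placeholders and are not taken from the ICA model itself.

```python
from collections import defaultdict

# Hierarchical (fonds-down) description: every unit has exactly one parent.
hierarchy = {
    "fonds": None,
    "series": "fonds",
    "file": "series",
    "item": "file",
}

def path_to_root(unit):
    """Walk up the single chain of parents from a unit to its fonds."""
    path = [unit]
    while hierarchy[unit] is not None:
        unit = hierarchy[unit]
        path.append(unit)
    return path

# Graph description: any entity can be linked to any other,
# and each link carries its own label (all labels here are made up).
graph = defaultdict(list)

def relate(subject, relation, obj):
    """Store a labelled link in both directions."""
    graph[subject].append((relation, obj))
    graph[obj].append((f"inverse of {relation}", subject))

relate("record A", "created by", "agent X")
relate("record A", "held by", "institution 1")
relate("record B", "created by", "agent X")
relate("record B", "held by", "institution 2")

def neighbours(entity):
    """Return every entity directly linked to the given one."""
    return sorted(obj for _, obj in graph[entity])

print(path_to_root("item"))   # a single route up the hierarchy
print(neighbours("agent X"))  # records held by two institutions meet at one agent
```

In the hierarchy there is only one way to reach an item, whereas in the graph the same agent can be approached from records held in different places, which is the kind of cross-collection linking described above.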

Visual diagram of how the Records in Contexts conceptual model works

“This example shows some information about P.G.F. Leveau a French public notary in the 19th century including:
• data from the Archives nationales de France (ANF) (in blue); and
• data from a local archival institution, the Archives départementales du Cher (in yellow).” International Council on Archives, Records in Contexts: A Conceptual Model for Archival Description, p. 93

 

Traditional Fonds Level Description

 

I really enjoyed the conference as a whole and the talk by John Sheridan. I learnt a lot about the National Archives’ approach to web archiving, the challenges and where the future of web archiving might go. I’m looking forward to taking this new knowledge and applying it to the web archiving work I do here at the Bodleian.

Changes are currently being made to the National Archives Web Archiving site and it will relaunch on the 1st July this year. Why don’t you go and check it out?

Why archive the web?

Here at the Bodleian Libraries’ Web Archive (BLWA), the archiving process starts with a nomination – either by our web curators or by you, the public. The nominated URLs the BLWA team then select for archiving are those specifically identified as being of lasting value and significance for preservation.

Not only are the sites chosen from a preservation standpoint – we are also continually seeking to build up the scope and content of our 7 collections within the BLWA: University of Oxford; University of Oxford colleges; University of Oxford museums, libraries and archives; social sciences; arts and humanities; international; and science, medicine and technology. Exactly like a physical collection, the sites belonging to the web collection will be used for research, fact checking, discovery and collaboration. There can be no denying that the web is the platform on which so much of contemporary society occurs. In the future then, and indeed now, web archives are providing an insight into our history.

Anti-Apartheid Movement Archives – http://www.aamarchives.org/

The AAMA site is part of our international collection in the BLWA. Within this collection we have captured aamarchives.org 7 times since 24th November 2015. This online platform is vital for digital access to further research, cross-cultural relationships and efforts towards understanding the history of the British Anti-Apartheid Movement 1959 – 1994. This capture has preserved the navigation and functionality of the site, and links still resolve; for example, the user community can still browse the archive, learn about campaigns and download resources. The date and time are clearly displayed in the banner at the top.

BLWA’s first capture of the online AAMA

This website can also be used and explored in conjunction with our related physical holdings. Here at the Bodleian Special Collections we have an amazing depth and range of physical material in the Anti-Apartheid Movement archive and our Commonwealth and African studies collections. You can browse the catalogue for this here.

This archived capture is fully functional, like a live site.

This is a tangible example of how digital preservation enhances and complements physical material and ensures records can reach a wider audience. How exciting it is that a researcher can consult manuscript or archived material, alongside captures of websites from the past in order to gain more of an insight and have a wider scope of substance to survey!

Web content like the aamarchives.org/ is not as stable as you might presume. A repository of web based collections enables future discovery of internet sites that are perhaps taken for granted due to the nature of our technological society; everything is just a tap or a click away. In fact, much of the material we interact with today is only available online. The truth is that web content is ephemeral: there is a very real threat that it can rapidly change and disappear altogether. Therefore web archiving initiatives are vital to preserve these valuable resources for good. Through these captures, provenance, arrangement and content have been preserved; and arguably most importantly of all – access.

Both individual collections and the web archive as a whole can be searched for a specific site, or browsed at leisure.

Growth of open access and web-based initiatives means that there is an ever-increasing network of digital libraries on a global scale. There is no doubt that the practice of web archiving is a significant contribution towards ensuring knowledge for all. Access to the Internet, enabling access to an ever-growing knowledge depository, is central to the integrity of educational and professional research, web archiving and, on a larger scale, digital preservation.

Browse our collections in Bodleian Libraries’ Web Archive

Get involved and help preserve our history! Nominate a site to archive

‘Getting Started with Digital Preservation’ Workshop

On the 17th of May I attended the Digital Preservation Coalition’s (DPC) ‘Getting Started with Digital Preservation’ workshop in London.

The one-day event was a great opportunity to gain clear insights into starting in the digital preservation sector, and provided a useful platform for networking with other archivists. The event consisted of lectures from DPC members on various topics related to starting digital preservation. It also included group exercises that were aimed at putting these ideas into practice.

The day started with a brief overview of digital preservation. The DPC team began by asking us to identify the main aspects of traditional archival preservation for physical documents: for example, a document’s physical, robust and tangible nature; its ability to be understood independently, without relying on technology; the existence of well-established approaches to its preservation; and a well-established understanding of how to assess the value of such documents.

This was used as a springboard to introduce us to many of the issues that we would face in transitioning to digital: the ephemeral and intangible nature of digital material (1s & 0s can’t be held in your hands); the need for technology and software for documents to be understood (e.g. a PDF file requires software to open it); obsolescence (e.g. new hardware and software making older files redundant); and the lack of any value-assessment experience in the field (how do we assess the value of a set of data?).

These areas helped us to understand that digital preservation presents its own set of unique challenges that have to be understood within their own context. The question ‘Why Digitise?’ was then put to the attendees at the workshop. The responses focused on legal, research, cultural heritage, funding, efficiency, contingency and access reasons for digitising. This shows us that digital preservation cannot be seen as a simple solution to a single problem but as a complex solution to many.

Bit-level preservation was covered in detail at the workshop. This section focused on the potential dangers that could affect data and how to prevent these from occurring. The three main areas were media obsolescence, where a media type is no longer used or the hardware no longer exists to support it; media failure or decay, when the media itself reaches the end of its life cycle or breaks; and natural or human-made disaster, such as fire or earthquakes. These dangers are mitigated by keeping more than 2-3 backup copies of the data (the actual number of copies needed is a subject of debate), storing these copies in different geographical locations, and periodically migrating the data to new storage devices.

The workshop also looked at integrity checks and the role they play in bit-level preservation. Integrity checking is the process of creating a ‘checksum’ or ‘hash value’ (a number generated by running an integrity checking program such as Fixity or ACE on a file; the COPTR registry lists further fixity tools). This number is effectively unique to that data, like a fingerprint, and can be used to check whether the data has changed or become corrupted in any way, whether through bit-rot or other data corruption.

Fixity: https://www.avpreserve.com/tools/fixity/
ACE: https://wiki.umiacs.umd.edu/adapt/index.php/Ace
COPTR: http://coptr.digipres.org/Category.Fixity
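
As a rough illustration of the idea behind these tools, the short Python sketch below creates a checksum for a file and later uses it to detect a change. It is not how Fixity or ACE are implemented; the function names and the sample file are my own, and SHA-256 is simply one commonly used hash algorithm.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def is_unchanged(path, recorded_checksum):
    """Compare a freshly computed checksum with the one recorded earlier."""
    return sha256_of(path) == recorded_checksum

# Demonstration on a throwaway file (hypothetical content).
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "sample.txt"
    sample.write_bytes(b"original content")
    recorded = sha256_of(sample)          # store this alongside the file

    intact_before = is_unchanged(sample, recorded)

    sample.write_bytes(b"original c0ntent")  # simulate a single corrupted character
    intact_after = is_unchanged(sample, recorded)

print(intact_before, intact_after)
```

The recorded checksum would normally be kept separately from the file and re-checked on a schedule, so that a copy that no longer matches can be replaced from one of the other copies.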

Later in the workshop characterisation tools were demonstrated. The tool showcased was DROID (Digital Record Object Identification). DROID is an open-source tool that analyses the file types and formats on a system, identifying them against PRONOM, a database of file formats. The presentations stressed that the databases these tools rely on are important and need regular updating to stay accurate. Other characterisation tools mentioned were C3PO, JHOVE, Tika and FITS.

PRONOM: http://www.nationalarchives.gov.uk/PRONOM/Default.aspx
DROID: https://sourceforge.net/projects/droid/
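
To give a feel for what identifying a format means in practice, here is a minimal Python sketch that matches the first few bytes of a file against a small table of well-known signatures. This is an assumption-laden toy, not DROID’s method: the table is hand-written for the example, the `identify` function is hypothetical, and real tools rely on far richer signature data of the kind PRONOM provides.

```python
# A tiny, hand-written signature table (illustrative only).
# Each entry pairs the bytes a file starts with and a human-readable format name.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"\xff\xd8\xff", "JPEG image"),
]

def identify(data):
    """Return the first format whose signature matches the start of the data."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"

print(identify(b"%PDF-1.4 rest of file"))
print(identify(b"just some plain text"))
```

The sketch also shows why the signature database matters so much: a format that is missing from the table simply comes back as unknown.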

The presentation on departmental readiness provided useful insights into preparing for digital preservation projects. It focused on the way that maturity models could be used to benchmark your department’s readiness for digital preservation. The two main models discussed were the Digital Preservation Capability Maturity Model and the NDSA Levels of Digital Preservation. These models aim to identify gaps in an institution’s readiness for digital preservation, whilst also focusing on aspects of best practice that it could aim to achieve.

DPCMM: http://www.securelyrooted.com/dpcmm
NDSA: http://www.digitalpreservation.gov/documents/NDSA_Levels_Archiving_2013.pdf

A risk assessment exercise also formed part of the workshop. Those attending were asked to consider how various risks would affect the digital archival process. The risks would then be ranked on their likelihood of occurring, and the potential damage that they might cause. We would then propose potential solutions to help mitigate these risks, and prevent further ‘explosive’ risks from occurring. This was followed by assessing whether the scores for both criteria had improved.

The last presentation was on digital asset registers. It focused on the importance of creating and managing a detailed spreadsheet to hold an institution’s digital assets, with the aim of having one organised and accessible source of information on a digital collection. The presentation focused on how this register could be shared with all members of staff to promote a better understanding of a digital collection. It mentioned that this would remove the issue of having one staff member who was a sole specialist on a collection, and promote further transparency throughout the digital preservation process. Another idea mentioned was that the register could be used for promoting further funding into digital collections, by providing a visual representation of the digital preservation process.

I thoroughly enjoyed the DPC workshop and look forward to attending similar workshops.

 

Digital Preservation Workshop

It was a real privilege to attend the Digital Preservation Coalition’s workshop, ‘Getting Started with Digital Preservation’ in London on 17th May 2017. As a newcomer to this topic I was eager to learn more, and the workshop definitely didn’t disappoint, providing me with a fantastic insight into the tools recommended for digital preservation, the challenges it presents, and the solutions that can be used to overcome these.

The workshop began with an introduction to digital preservation, defined neatly by Sharon McMeekin (Head of Training and Skills) as the active management of digital content over a period of time to ensure continued access. We learnt about the sorts of features systems should incorporate to allow for continued access to digital content. These included:

• Resilience, standards, and open to testing
• Error checking, compatibility with multimedia, and back-up
• Authenticity checking

As the morning progressed it was interesting to learn more about some of the difficulties that digital preservation presents including:

• Media obsolescence
• Media failure or decay (otherwise known as ‘bit-rot’)
• Natural disaster
• Man-made error
• Malicious damage
• Viruses
• Network failure
• Disassociation

Methods of dealing with these challenges included storing more than one copy in different geographic locations, refreshing storage media, and integrity checking, also known as ‘fixity checking’, which is the process of checking whether a digital file has remained unchanged.

As part of this final solution we also learnt about ‘checksums’, which act like ‘digital fingerprints’ and are used to check whether the contents of a file have altered.

The DPC also recommended generating a risk register as a further preventative measure to protect digital material against potential hazards. We even had a go at creating our own risk register based on a fictional scenario. This involved recording the:

• Type of risk
• Consequence of risk
• Likelihood of occurrence
• Impact on institution
• Frequency
• Owner
• Response/solution
• New likelihood of occurrence
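
For anyone who would like to try this outside a spreadsheet, the fields above could be sketched in Python as follows. This is only one possible layout and rests on assumptions of mine: the 1 to 5 scales, the score calculated as likelihood multiplied by impact, and the two example entries are all invented for illustration and were not prescribed at the workshop.

```python
from dataclasses import dataclass

@dataclass
class RiskEntry:
    risk_type: str
    consequence: str
    likelihood: int       # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int           # assumed scale: 1 (minor) to 5 (severe)
    frequency: str
    owner: str
    response: str
    new_likelihood: int   # likelihood once the response is in place

    def score(self):
        """Assumed scoring rule: likelihood multiplied by impact."""
        return self.likelihood * self.impact

    def residual_score(self):
        """The same rule applied after the response has been put in place."""
        return self.new_likelihood * self.impact

# Two made-up entries for a fictional institution.
register = [
    RiskEntry("Media failure", "Files unreadable", 3, 5, "Yearly review",
              "Digital archivist", "Keep copies in two locations", 1),
    RiskEntry("Network failure", "Temporary loss of access", 4, 2, "Monthly review",
              "IT services", "Document a fallback access route", 3),
]

# List the risks with the highest score first, showing the effect of each response.
for entry in sorted(register, key=RiskEntry.score, reverse=True):
    print(entry.risk_type, entry.score(), "->", entry.residual_score())
```

Sorting by score puts the most pressing risks at the top, and comparing the two scores shows at a glance whether a proposed response actually reduces the risk.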

As well as safeguarding digital material, we learnt that a risk register has the added benefit of introducing clearer planning within an institution, serving as an advocacy tool, highlighting clearer responsibilities, and benefitting the Digital Asset Register. The DPC recommended that institutions use DRAMBORA, a digital repository audit method based on risk assessment, which encourages organisations to generate an awareness of their objectives and activities before identifying and managing the risks to their digital collections.

Finally, Digital Asset Registers were recommended as useful tools for digital preservation coordination, since they gather all of the digital information into one place and log preservation risks to collections. They also provide institutions with a finding aid in the absence of other discovery methods, and support best practice and advocacy.

The characterisation tool DROID was also mentioned as a useful software application for identifying file formats. Developed by the National Archives, this tool records the number, size, and format of each file in addition to creating a checksum for each.

The workshop was a wonderful opportunity to learn more about digital preservation and meet with other professionals from the same field. I am now really looking forward to undertaking some of my own digital preservation and archiving projects at the Bodleian.

Thai Manuscript Conservation Association Workshop at the Bodleian

On 14th and 15th December staff from Bodleian Special Collections and Digital Library Systems and Services welcomed representatives from the Manuscript Conservation Association of Thailand. Delegates included Mr. Boonlert Sananon, President of the MCA, Mr. Boonlue Burarnsan, Vice President of the MCA, and Mrs. Phatchanun Bunnag, Registrar of the MCA.


During the first day of the workshop delegates discussed the latest developments in TEI/XML cataloguing standards for Thai manuscripts at the Centre for Digital Scholarship. On the morning of the second day of the workshop the delegates visited the Conservation workshop. This was followed by a lecture given by Mr Saneh Mahapol, from the Fine Arts Department of the Ministry of Culture, on the conservation of palm leaf books in Thailand.

The workshop ended with delegates helping the library to identify and make basic TEI descriptions of uncatalogued Thai manuscripts in the Bodleian’s collection.


iPRES 2016

Last month, I attended the 13th International Conference on Digital Preservation, this year hosted in Bern, Switzerland. The four days of papers, panels, posters and workshops were an intensive and exciting opportunity to meet with colleagues working in digital preservation around the world, share ideas, and hear about innovative projects and approaches. The topics ranged widely from technical systems and practices, to quality and risk assessment, and stewardship and sustainability. What follows are just a couple of highlights from a really fascinating week.

Networking wall

The post-it note networking wall: What do you know? What do you want to know?

Net-based and digital art

As email, digital documents and social media replace traditional forms of communication, it is crucial to be able to preserve born-digital material and make it accessible. An area which I hadn’t previously considered was the realm of net-based art. Here, the internet is used as an artistic medium, which of course has implications (and complications) for digital preservation.

In her keynote speech, Sabine Himmelsbach from the House of Electronic Arts in Basel introduced us to this exciting field, showing artwork such as Olia Lialina’s ‘Summer’, 2013, shown below.

Summer, by Olia Lialina

Screenshot of Summer, Olia Lialina, 2013. Available at https://www.youtube.com/watch?v=SxvHoXdC4Uk

The artwork features an animated loop of Lialina swinging from the browser bar. Each frame is hosted by a different website, and the playback therefore depends on your connection speed. This creative use of technology creates enormous challenges for preservation. Here, rather than preserving artefacts, it is the preservation of behaviours which is crucial, and these behaviours are extremely vulnerable to obsolescence.

Marc Lee’s ‘TV Bot’ is another net-based artwork, which is automated to broadcast current news stories with live TV streams, radio streams and webcam images from around the world. The artwork is reliant on technical infrastructure, and the shift from Real Player to Adobe Flash Player was one development which prevented ‘TV Bot’ from functioning. The artist then not only worked on technical migration, but re-interpreted the artwork, modernising the look and feel, resulting in ‘TV Bot 2.0’ in 2010. This process soon happened again, this time including a Twitter stream, in ‘TV Bot 3.0’, 2016. In this way, the artist is working against cultural, as well as technical, obsolescence.

Marc Lee, ‘TV Bot 2.0’, 2010. Image from http://ceaac.org/en/artistes/marc-lee

The heavy involvement from the artist in this case has helped preserve the artwork, but this process cannot be sustained indefinitely. Himmelsbach ended her speech by stressing the need for collaboration and dialogue, which emerged as a central theme of the conference.

A new approach to web archiving

Another highlight was the workshop on Webrecorder led by Dragan Espenschied from Rhizome. He introduced their new tool, which departs from the usual crawling method to capture web content ‘symmetrically’, resulting in incredibly high-fidelity captures. The demonstration of how the tool can capture dynamic and interactive content sparked gasps of amazement from the group!

Webrecorder not only captures social media, embedded video and complex JavaScript (often tricky with current tools), but can actually capture the essence of an individual’s interaction with the web content.

How it works: Webrecorder records all the content you interact with during the recording session. Users are then able to interact with the content themselves, but anything that was not viewed during the recording session will not be available to them.
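
The behaviour described here can be pictured with a very small Python sketch. To be clear, this is a conceptual toy and not Webrecorder’s implementation: the `RecordingSession` class, its methods and the example URLs are all hypothetical, and a real capture stores full network traffic rather than simple strings.

```python
class RecordingSession:
    """Toy model: only what was visited during recording can be replayed."""

    def __init__(self):
        self._captured = {}

    def record(self, url, content):
        """Store the content seen at a URL while the session is recording."""
        self._captured[url] = content

    def replay(self, url):
        """Return the recorded content, or None if the URL was never visited."""
        return self._captured.get(url)

# The archivist visits two pages during the recording session.
session = RecordingSession()
session.record("https://example.org/", "<html>home page</html>")
session.record("https://example.org/video", "<html>embedded video</html>")

# A later user can replay what was visited, but nothing else.
print(session.replay("https://example.org/video") is not None)
print(session.replay("https://example.org/about") is not None)
```

The second lookup fails because that page was never opened during the recording, which is exactly the limitation noted above.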

Current web archiving strategies aren’t able to capture the personalised nature of web use. How to use this functionality is still a big question, as a web recording made in this way would be personal to the web archivist, showing what they decided to explore, unless an institution designed a systematic approach. That would itself be very resource-intensive, and arguably the real potential of Webrecorder lies elsewhere: in its ability to capture dynamic content, such as net-based artworks. However, the possibility of preserving not only web content, but our interaction with it, is a very exciting development.

iPRES 2016 was a fantastic opportunity to gain insight into projects happening around the world to further digital preservation. It showed me that often there are no clear answers to ‘which file format is best for that?’ or ‘how do I preserve this?’ and that seeking advice from others, and experimenting, is often the way forward. What was really clear from attending was that the strength and support of the community is the most valuable digital preservation tool available.

 

Capturing and Preserving the EU Referendum Debate (Brexit) – UK Web Archive blog

Following the announcement in May 2015 that there would be a referendum on the UK’s EU membership, the Legal Deposit UK Web Archive, led by curators at the Bodleian Libraries, started a collection of websites.

The team of curators includes contributors from the Bodleian Libraries, The British Library, the National Libraries of Scotland and Wales and also Queen’s University Belfast (for the Northern Ireland perspective) and the London School of Economics (for capturing and preserving individual documents, such as the PDF versions of campaigning leaflets).

The collection scope is to capture the ‘Brexit’ debate and the debate around the EU Referendum as well as the wider context of UK/EU relations, including:

  • media coverage
  • websites of political parties and other political institutions and groups
  • campaigning and lobbying
  • trade unions, professional organisations, businesses
  • academic debate
  • culture and arts
  • public opinion through blogs, comments, and if possible social media.

We primarily archive UK websites under the Non-Print Legal Deposit mandate, but also decided to include some sites outside the UK, if relevant – e.g. websites of UK expats in Europe, or political parties, interest groups and think tanks in the EU and in EU member states – on a permission basis.

The collection (at the time of writing) has 2590 target websites. Some of these are whole websites; others will be a single news story or blog post.

Access and availability
The majority of the collection will be available in the reading rooms of UK Legal Deposit libraries, including both British Library sites, the Bodleian Libraries in Oxford, the National Library of Scotland, the National Library of Wales, Cambridge University Library and Trinity College Dublin. As is usual for web archive collections, there is a delay between collection and availability of up to a year, allowing for cataloguing and for ingest into digital library systems.

by Svenja Kunze, Project Archivist, Bodleian Libraries (Oxford University)

Source: Capturing and Preserving the EU Referendum Debate (Brexit) – UK Web Archive blog

EU Referendum Web Archiving Mini-internship – Part 1

On 20 and 21 June eight Oxford University students took part in a web archiving micro-internship at the Weston Library’s Centre for Digital Scholarship. Working with the UK Legal Deposit Web Archive, they contributed to the curation of a special collection of websites on the UK European Referendum. This is the first of two guest blog posts on the micro-internship.

Web archiving micro-interns on the roof of the Weston Library, June 2016.

Using library archives for their research is not a novelty for any student or scholar. However, web archives represent a completely new dimension of swiftly evolving research methods – they aim to document what is posted online – a relatively recent form of data collection due to scientific advancements.

For researchers used to traditional archives, the need to store and analyse this data might not be obvious. However, web archiving, despite being relatively new, is very significant. Firstly, it allows us to store information for generations of future historians and sociologists – contrary to common perception, much of the data held on the World Wide Web disappears or changes very frequently and rapidly. Secondly, it might be an asset for those pursuing topical research projects in the present – recent technologies (such as the prototype SHINE database for historical research) allow us to trace data trends and come to important and fascinating conclusions. Therefore, even if some might underrate web archives, this surely does not diminish their utility to academia.

On the eve of the Brexit referendum, which sparked many debates and discussions in British web space, timely creation of a web collection has proven to be very important – after all, the decision is likely to have long-term consequences for our society, economy, and legal system. Traditionally, individual narratives and civic engagement are set aside when documenting major political decisions. However, a web collection can significantly improve this situation by collecting diverse standpoints expressed in the web sphere. This, in my opinion, perfectly mirrors the ethos of direct democracy where every vote and view counts.

However, important as it is, web archiving comes with a range of practical and ethical obstacles: with huge masses of information being stored online it is very hard to choose what is worthy of being preserved for future generations. Legal restrictions, such as the recent legal deposit legislation, also significantly limit the scope of archivists’ work. During my micro-internship I, along with other interns, tried to overcome these obstacles as much as possible, minimising bias and efficiently using our time resources and server memory. Even in the era of technology, it is the human resources and individual judgment that shape the scope and direction of the collection.

Working on a web collection, especially as campaigning increased just before the referendum, was very challenging. However, as interns, we tackled the masses of information by focusing on individual areas of knowledge. Our work on the project was also aided by the guidance provided by our supervisors and by discussions on the ethical and scientific implications of our research. This was a very rewarding insight into a new area of knowledge, and I am convinced that the skills and knowledge I acquired and applied during the internship will aid me in my future research career.

Anna Lukina

What has web archiving ever done for us? – Saving our dinner plans, for example.

The Bodleian Libraries is involved in web archiving both through the Bodleian Libraries’ own web archive, since 2011, and – as one of the six UK Legal Deposit Libraries – through the Legal Deposit UK Web Archive, since 2013.

What’s cooking in the web archives? (Detail from painting by Jean-François Millet [Public domain], via Wikimedia Commons)

A considerable amount of archivists’, curators’ and subject librarians’ time goes into this web archiving work, be it selecting websites for archiving, capturing and preserving web content, describing web archive resources or participating in web archiving strategy, collections management and outreach activities.

Current web archiving projects at the Bodleian include the further development of the Bodleian Libraries Web Archive, for example to capture audio files hosted on web servers, and curatorial work in the UK Web Archive context, such as the Easter Rising 1916 Web Archive and the EU Referendum website collection.

But why archive the web?

What’s on the internet will be there forever, won’t it? Haven’t we all been warned to be careful what we put on the internet, because all the information out there will still reveal awkward details of our first-year-at-university life when we are about to retire?

Unfortunately, for archivists, this is far from what really happens. In fact, websites are extremely ephemeral. They change and disappear at a fast rate.
