Digital Preservation: What I Wish I Knew Before I Started

Last week I attended a student conference on digital preservation, hosted by the Digital Preservation Coalition. The event was called ‘What I Wish I Knew Before I Started’, and several digital preservationists gave some very interesting insights into the skills and challenges of digital preservation. The basic message was that digital preservation is a big task which requires urgent action, but that archivists already have many of the skills needed to carry it out. I’ve posted about it at the futureArch blog, if anyone is interested in finding out more.

4 comments on “Digital Preservation: What I Wish I Knew Before I Started”

  1. Hi Rebecca, thanks for your interesting post. I was wondering about the process of archiving a website: do you archive all the files onto your own server? And if so, how connected are you to the actual website – as in, are you only archiving Oxford uni ones, or do you ask other people nicely if you can have their files? Basically, how does it work? Ta, Laurence.

  2. Hi Laurence, thanks for your question! We make use of a service provided by the Internet Archive called Archive-It ( http://www.archive-it.org/ ) to crawl the websites, making a copy which is stored on their servers.

    We do archive Oxford University websites, but we also archive websites that are related to existing collections within the Bodleian, provided we can get permission from the website owners. If you look at our Archive-It page ( http://www.archive-it.org/organizations/467 ) you can see which sites we’ve archived.

    I hope that makes sense! Let me know if you have any more questions.

  3. So you’re not so much copying the files themselves as how they are rendered by the web browser? Can you do that and still keep all the links working? Fancy. And what about a website that has a database behind it?… Maybe we should continue this in more detail at the pub…

  4. Yes, all the links work. Basically we’re gathering the HTML code and all the other bits of information that make up a website (images, videos, Flash, audio content, etc.) so that Archive-It can display it as an actual website. So all of the links within the website work as normal (or should do, anyway). What’s really good is that if a site links to another website we archive, Archive-It can make the connection, so it works almost like a mini-internet of archived sites (not a technical description; there’s a rough sketch of the crawling idea below).

    As for databases, I’m not entirely sure, but I should think that if it can be crawled it ought to work. I’ll try to find out! I can report back at the pub…!
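For anyone wondering what “crawling” actually involves, here is a toy sketch in Python of the general idea the comments describe: fetch a page, save a copy, and follow links that stay on the same site. It is only an illustration under my own assumptions, not how Archive-It really works; a real crawler also captures images, stylesheets and scripts, and rewrites links so the archived pages point at each other. The start URL and the output directory name are placeholders.

```python
# Toy crawler illustrating the idea discussed in the comments: fetch a page,
# save a local copy, and queue up links that stay on the same site. This is a
# minimal sketch only, not Archive-It's implementation.
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, out_dir="archive", limit=10):
    """Breadth-first crawl of one site, saving each page's HTML to disk."""
    site = urlparse(start_url).netloc
    queue, seen = [start_url], set()
    Path(out_dir).mkdir(exist_ok=True)
    while queue and len(seen) < limit:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            with urlopen(url) as response:
                html = response.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # skip pages that fail to download
        # Save a copy named after the URL path.
        name = urlparse(url).path.strip("/").replace("/", "_") or "index"
        (Path(out_dir) / f"{name}.html").write_text(html, encoding="utf-8")
        # Queue links that stay on the same site, dropping #fragments.
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            absolute, _ = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == site:
                queue.append(absolute)


if __name__ == "__main__":
    crawl("https://example.com")  # placeholder start URL
```

Run against a small site, this leaves a folder of saved HTML files, one per page crawled; the real service does much more, but the crawl-and-copy loop is the heart of it.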
