UK Web Archive mini-conference 2020

On Wednesday 19th November I attended the UK Web Archive (UKWA) mini-conference 2020, my first conference as a Graduate Trainee Digital Archivist. It was hosted by Jason Webber, Engagement Manager at the UKWA, and, as is normal in these COVID times, it took place on Zoom (my first ever Zoom experience!).

The conference started with an introduction to and demonstration of the UKWA by Jason Webber. Started in 2005, the UKWA has a mission to collect the entire UK webspace at least once per year and to preserve the websites for future generations. As part of my traineeship I have used the UKWA, but it was interesting to hear about the other functions and collections it provides. Along with being able to browse different versions of UK websites, it also includes over 100 curated collections on themes ranging from Food to Brexit to Online Enthusiast Communities in the UK. It also features the SHINE tool, which was developed as part of the ‘Big UK Domain Data for the Arts and Humanities’ project and contains over 3.5 billion items which have been full-text indexed so that every word is searchable. It allows users to perform searches and trend analysis on subjects over a huge range of websites; all you need to use this tool is a bit of Python knowledge. My Python knowledge is a bit basic, but Caio Mello, during his researcher talk, provided a useful link to online Python tutorials aimed at historians to aid in their research.
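
To give a flavour of what "trend analysis" means here, the following is a minimal Python sketch. It does not call SHINE or any UKWA service: the records are a made-up toy list, and the names `RECORDS` and `term_trend` are hypothetical. It simply works out, for each year, what share of the records mention a search term.

```python
from collections import defaultdict

# Hypothetical toy records: (crawl year, page text). Not real UKWA data.
RECORDS = [
    (2010, "London prepares for the Olympic Games"),
    (2010, "Local council budget announced"),
    (2012, "Olympic legacy plans for east London"),
    (2012, "Olympic park opens to visitors"),
    (2014, "Debate over the Olympic legacy continues"),
    (2014, "New cycling routes announced"),
    (2014, "Weather warning issued"),
]


def term_trend(records, term):
    """Return {year: fraction of that year's records mentioning term}."""
    totals = defaultdict(int)  # records seen per year
    hits = defaultdict(int)    # records per year that contain the term
    needle = term.lower()
    for year, text in records:
        totals[year] += 1
        if needle in text.lower():
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


if __name__ == "__main__":
    for year, share in term_trend(RECORDS, "olympic legacy").items():
        print(year, f"{share:.0%}")
```

Reporting a share rather than a raw count is a deliberate choice in this sketch, since the number of archived pages varies a lot from year to year.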

In his talk, Caio Mello (School of Advanced Study, University of London) discussed how he used the SHINE tool as part of his work for the CLEOPATRA Project. He was specifically looking at the legacy of the 2012 Olympics: how it was defined and how views of it changed over time. He explained the process he used to extract the information and the ways the information can be used for analysis, visualisation and context. My background is in mathematics and the concept of ‘Big Data’ came up frequently during my studies, so it was fascinating to see how it can be used in a research project and how the UKWA is enabling research to be conducted over such a wide range of subjects.

The next researcher talk by Liam Markey (University of Liverpool and the British Library) showed a different approach to using the UKWA for his research project into how Remembrance in 20th Century Britain has changed. He explained how he conducted an analysis of archived newspaper articles, using specific search terms, to identify articles that focused on commemoration which he could then use to examine how the attitudes changed over time. The UKWA enabled him to find websites that focused on the war and compare these with mainstream newspapers to see how these differ.

The keynote was given by Paul Gooding (University of Glasgow) and was about the use and users of Non-Print Legal Deposit libraries. His research as part of the Digital Library Futures Project, with the Bodleian Libraries and Cambridge University Library as case study partners, looked at how academic deposit libraries were impacted by e-Legal Deposit. It was an interesting discussion around some of the issues of the system, such as balancing commercial rights with access for users, and how highly restrictive access conditions are at odds with more recent legislation, such as the provision for disabled users and the 2014 copyright exception for text and data mining for non-commercial uses.

Being new to the digital archiving world, my first conference was a great introduction to web archiving and provided context to the work I am doing. Thank you to the organisers and speakers for giving me insight into a few of the different ways the web archive is used and I have come away with a greater understanding of the scope and importance of digital archiving (as well as a list of blog posts and tutorials to delve into!)

Some Useful Links:

https://www.webarchive.org.uk/

https://programminghistorian.org/

https://blogs.bl.uk/webarchive/2020/11/how-remembrance-day-has-changed.html

http://cleopatra-project.eu/

 

#WeMissiPRES: Preserving social media and boiling 1.04 x 10^16 kettles

This year the annual iPRES digital preservation conference was understandably postponed and in its place the community hosted a 3-day Zoom conference called #WeMissiPRES. As two of the Bodleian Libraries’ Graduate Trainee Digital Archivists, Simon and I were in attendance and blogged about our experiences. This post contains some of my highlights.

The conference kicked off with a keynote by Geert Lovink. Geert is the founding director of the Institute of Network Cultures and the author of several books on critical Internet studies. His talk was wide-ranging and covered topics from the rise of so-called ‘Zoom fatigue’ (I guarantee you know this feeling by now) to how social media platforms affect all aspects of contemporary life, often in negative ways. Geert highlighted the importance of preserving social media in order to allow future generations to be able to understand the present historical moment. However, this is a complicated area of digital preservation because archiving social media presents a host of ethical and technical challenges. For instance, how do we accurately capture the experience of using social media when the content displayed to you is largely dictated by an algorithm that is not made public for us to replicate?

After the keynote I attended a series of talks about the ARCHIVER project. João Fernandes from CERN explained that the goal of this project is to improve archiving and digital preservation services for scientific and research data. Preservation solutions for this type of data need to be cost-effective, scalable, and capable of ingesting amounts of data within the petabyte range. There were several further talks from companies who are submitting to the design phase of this project, including Matthew Addis from Arkivum. Matthew’s talk focused on the ways that digital preservation can be conducted on the industrial scale required to meet the brief and explained that Arkivum is collaborating with Google to achieve this, because Google’s cloud infrastructure can be leveraged for petabyte-scale storage. He also noted that while the marriage of preserved content with robust metadata is important in any digital preservation context, it is essential for repositories dealing with very complex scientific data.

In the afternoon I attended a range of talks that addressed new standards and technologies in digital preservation. Linas Cepinskas (Data Archiving and Networked Services (DANS)) spoke about a self-assessment tool for the FAIR principles, which is designed to assess whether data is Findable, Accessible, Interoperable and Reusable. Later, Barbara Sierman (DigitalPreservation.nl) and Ingrid Dillo (DANS) spoke about TRUST, a new set of guiding principles that are designed to map well with FAIR and assess the reliability of data repositories. Antonio Guillermo Martinez (LIBNOVA) gave a talk about his research into Artificial Intelligence and machine learning applied to digital preservation. Through case studies, he identified that AI is especially good at tasks such as anomaly detection and automatic metadata generation. However, he found that regardless of how well the AI performs, it needs to generate better explanations for its decisions, because it’s hard for human beings to build trust in automated decisions that we find opaque.

Paul Stokes from Jisc3C gave a talk on calculating the carbon costs of digital curation and unfortunately concluded that not much research has been done in this area. The need to improve the environmental sustainability of all human activity could not be more pressing and digital preservation is no exception, as approximately 3% of the world’s electricity is used by data centres. Paul also offered the statistic that enough power is consumed by data centres worldwide to boil 10,400,000,000,000,000 kettles – which is the most important digital preservation metric I can think of.

This conference was challenging and eye-opening because it gave me an insight into (complicated!) areas of digital preservation that I was not familiar with, particularly surrounding the challenges of preserving large quantities of scientific and research data. I’m very grateful to the speakers for sharing their research and to the organisers, who did a fantastic job of bringing the community together to bridge the gap between 2019 and 2021!

#WeMissiPRES: A Bridge from 2019 to 2021

Every year, the international digital preservation community meets for the iPRES conference, an opportunity for practitioners to exchange knowledge and showcase the latest developments in the field. With the 2020 conference unable to take place due to the global pandemic, digital preservation professionals instead gathered online for #WeMissiPRES to ensure that the global community remained connected. Our graduate trainee digital archivist Simon Mackley attended the first day of the event; in this blog post he reflects on some of the highlights of the talks and what they tell us about the state of the field.

How do you keep the global digital preservation community connected when international conferences are not possible? This was the challenge faced by the organisers of #WeMissiPRES, a three-day online event hosted by the Digital Preservation Coalition. Conceived as a festival of digital preservation, the aim was not to try to replicate the regular iPRES conference in an online format, but instead to serve as a bridge for the digital preservation community, connecting the efforts of 2019 with the plans for 2021.

As might be expected, the impact of the pandemic loomed large in many of the talks. Caylin Smith (Cambridge University Library) and Sara Day Thomson (University of Edinburgh) for instance gave a fascinating paper on the challenge of rapidly collecting institutional responses to coronavirus, focusing on the development of new workflows and streamlined processes. The difficulties of working from home, the requirements of remote access to resources, and the need to move training online likewise proved to be recurrent themes throughout the day. As someone whose own experience of digital preservation has been heavily shaped by the pandemic (I began my traineeship at the start of lockdown!) it was really useful to hear how colleagues in other institutions have risen to these challenges.

I was also struck by the different ways in which responses to the crisis have strengthened digital preservation efforts. Lynn Bruce and Eve Wright (National Records of Scotland) noted for instance that the experience of the pandemic has led to increased appreciation of the value of web-archiving from stakeholders, as the need to capture rapidly-changing content has become more apparent. Similarly, Natalie Harrower (Digital Repository of Ireland) made the excellent point that the crisis had not only highlighted the urgent need for the sharing of medical research data, but also the need to preserve it: coronavirus data may one day prove essential to fighting a future pandemic, and so there is a moral imperative for us to ensure that it is preserved.

As our keynote speaker Geert Lovink (Institute of Network Cultures) reminded us, the events of the past year have been momentous quite apart from the pandemic, with issues such as the distorting impacts of social media on society, the climate emergency, and global demands for racial justice all having risen to the forefront of society. It was great therefore to see the role of digital preservation in these challenges being addressed in many of the panel sessions. A personal highlight for me was the presentation by Daniel Steinmeier (KB National Library of the Netherlands) on diversity and digital preservation. Steinmeier stressed that in order for diversity efforts to be successful, institutions needed to commit to continuing programmes of inclusion rather than one-off actions, with the communities concerned actively included in the archiving process.

So what challenges can we expect from the year ahead? Perhaps more than ever this year, this has been a difficult question to answer. Nonetheless, a key theme that struck me from many of the discussions was that the growing challenge of archiving social media platforms was matched only by the increasing need to preserve the content hosted on them. As Zefi Kavvadia (International Institute of Social History) noted, many social media platforms actively resist archiving; even when preservation is possible, curators are faced with a dilemma between capturing user experiences and capturing platform data. Navigating this challenge will surely be a major priority for the profession going forward.

While perhaps no substitute for meeting in person, #WeMissiPRES nonetheless succeeded in bringing the international digital preservation community together in a shared celebration of the progress being made in the field, successfully bridging the gap between 2019 and 2021, and laying the foundations for next year’s conference.

 

#WeMissiPRES was held online from 22nd-24th September 2020. For more information, and for recordings of the talks and panel sessions, see the event page on the DPC website.

WARC Files and Blue Lagoons: The IIPC Web Archiving Conference, 13-15 April 2016 in Reykjavik

The International Internet Preservation Consortium (IIPC) is the leading international organisation dedicated to improving the tools, standards and best practices of web archiving, promoting international collaboration and the broad access and use of web archives for research and as cultural heritage.

This year, for the first time, the IIPC’s annual General Assembly in Reykjavik was accompanied by a three-day conference, bringing together web archivists, curators, IT specialists and researchers to discuss challenges related to acquiring, preserving, making available and using web archives. With over 150 participants, including leading experts – most prominently the internet pioneer Vint Cerf – the conference provided a unique opportunity to learn about web archiving strategies and projects around the world, and to keep up to date with emerging trends in research and the latest technological developments.

Vint Cerf, Avoiding a Digital Dark Age

The first day, after a warm welcome by Ingibjörg Sverrisdóttir, Iceland’s National Librarian, was dedicated to the ‘big questions’ of web archiving: What’s worth saving? (Hjalmar Gislason) and how to avoid a Digital Dark Age? (Vint Cerf). What might new services look like, and which tools and strategies for preservation are available (Emulation!) or being developed? Or, in the words of Brewster Kahle, founder of the Internet Archive: ‘20 years of Web Archiving – What do we do now?’ (video of his talk introducing the ‘National Library of Atlantis’ prototype for integrated web archive discovery)

Brewster Kahle, What Do We Do Now?

On the second day, the conference continued with two separate tracks, discussing either policies, practices and strategies for capture and preservation of web material, or looking more at the user side of web archives, and at how web archive data can be accessed, searched, analysed and visualised as a resource for research.
The third day was the hands-on day, with workshops exploring search interfaces such as the SHINE interface developed at the British Library for the UK Web Archive, DIY web archiving tools such as webrecorder.io, and the open-source platform Warcbase for analysing web archive data, and discussing the future of the WARC archive format.

There was plenty of time for Q&A and discussions between and after the talks and presentations, and the open, friendly atmosphere of the conference encouraged informal conversations with web archiving colleagues and networking during coffee and lunch breaks, and on visits like the tour of the National and University Library of Iceland.

The National and University Library of Iceland

Once again it became clear that web archiving practice is at the same time extremely diverse and dependent on joint efforts and collaborations:
For example, the priorities in curating a relatively small collection of Electronic Literature at the German Literary Archive Marbach are very different from those in capturing and preserving the .EU domain at the Portuguese National Foundation for Scientific Computing FCCN, owing to the scope, size and structure of the collections, and the resources available to build and maintain them. Similarly, quality assurance policies and workflows differ considerably between national domain scale archives, such as the Legal Deposit UK Web Archive containing millions of websites, and specialized archives curated and captured by university libraries like the North Carolina State University. Researchers approach the UK Government Web Archive with different research questions than those they would use to look at archived Twitter data.

But no matter the size and scope of the web archive, the resources available at a web archiving institution, or the focus of a particular project, the underlying challenges are very similar:

  • How do we decide what to capture?
  • How to capture it?
  • How to preserve it for the future?
  • Metadata?
  • How to provide access and facilitate discovery?
  • How to use web archives for research?

Working collaboratively and across disciplines, including perspectives from archivists, curators, IT engineers and researchers, seems to be the best way forward, and the practice of sharing knowledge and experience and openly discussing problems is certainly embraced by the web archiving community. A particular project might have ‘failed’ in terms of achieving the intended outcome, but it can still provide valuable lessons for the next project elsewhere, and in the long run, for developing best practice, policies and standards for web archiving as a discipline.

Mistakes are only wrong if you – and others – don’t learn from them!

Curators might be slightly overwhelmed by technical details discussed by web crawl engineers (I certainly was!) and ‘the IT guys (and girls)’ might sometimes be confused by the curatorial way of thinking; web archiving cultures in North America seem to differ considerably from the approaches in Europe, where Legal Deposit regulations have a strong impact on collection strategies and access to archives. STEM researchers look at data in different ways than historians and social scientists.
International conferences like the IIPC Web Archiving Conference 2016 are invaluable for bringing together these different perspectives, for fostering discussion and knowledge sharing, and for providing an opportunity to establish new and strengthen existing contacts with web archiving colleagues in archives, (university) libraries and research institutions worldwide.

Harvesting social media: Overview…

…and details.

Web archivists love to produce new social media content:
The conference seen through the participants’ Tweets: #iipcwac16.
(Now we just have to archive that!)

Not least, the Reykjavik conference provided a rare opportunity to meet web archiving colleagues from other UK Legal Deposit Libraries outside the usual committees and institutional settings. One of the conference lunch breaks was turned into an ad-hoc UK Legal Deposit Web Archive meeting, discussing user interface redevelopment – and where else but in Iceland can you have a Friday late afternoon conference debrief whilst soaking in a giant outdoor geothermal bathtub (aka the Blue Lagoon)?

Some very clean UK web archivists after the conference debrief

 

 

Catching butterflies

Archival Uncertainties: International Conference on Literary Archives at the British Library – 4 April 2016

This one-day conference focused on digital humanities, with papers from a spectrum of interested parties including academics working on digitisation projects, authors, translators, archivists and curators. I attended three panels on the day and the unifying theme was a contrary message of dispersal and amalgamation (and butterflies).

The first thing that has been dispersed or discarded is any idea of a literary canon. As plenary speaker and archivist Catherine Hobbs pointed out, scholarship now focuses less on established set texts and more on themes like “environmental literature”. Over the past few decades, in response to this, archives have collected more non-traditionally canonical literary papers but, Catherine reminded us, as archivists we can’t stop paying attention to the ways that literature continues to change. We need to keep tabs on what is going on in the literary world in order to document it, and this will include tackling new forms of experimental, avant-garde and self-published writing.

Caterpillars and collection development [By Eric Steinert – photo taken by Eric Steinert at Paussac, France, CC BY 2.5, https://commons.wikimedia.org/w/index.php?curid=338409]

As Catherine noted, it used to be easy to find the avant-garde – pretty much whoever was hanging out on the Left Bank – but now it’s up to archivists to not only collect this material, but to track it down in the first place, and not to default to the temptingly easy path of collecting only the papers of that tiny sliver of authors considered publishable by mainstream publishers.

Web Archives as Scholarly Sources: Issues, Practices and Perspectives

RESAW conference in Aarhus, 8-10 June 2015

Web archiving has been part of Special Collections work at the Bodleian Library for quite a while now, both in cooperation with other UK Legal Deposit Libraries within the electronic Legal Deposit framework (since 2013), and through the Bodleian Libraries’ own Web Archive.
But whereas the amount of archived web material – at the Bodleian and elsewhere – is constantly growing, the usage of these new resources has so far been quite low. Scholars, it seems, are largely unaware of the potential web archives have as sources for research, or lack the knowledge and skills to work with such material, while web archiving institutions lack the resources to promote their web archive collections and support their use.

The Research Infrastructure for the Study of Archived Web Materials (RESAW) network aims to promote the establishment of a collaborative European research infrastructure for the study of archived web materials. This means collaborating internationally as well as across disciplines to meet the challenges – and the opportunities – archived web materials bring, and to develop new methods and approaches in research and teaching.

One of the topics: How to archive Social Media content? And how to use archived Social Media content as scholarly sources?

Tweets from the conference have been collected via Storify. Thanks to Jane Winters from the Institute of Historical Research, University of London, for having set this up.

The 2015 RESAW conference, hosted by the University of Aarhus in Denmark, was the third in a series of conferences: the first conference in 2001 focused on how to preserve web content, the second in 2008 on web archives theory, and this year’s third conference on the actual use of web archives in research.
Participants included over 80 web archivists, curators, researchers, and IT experts from various disciplines from Canada, Denmark, Finland, France, Germany, Italy, Israel, the Netherlands, Russia, the UK, the United Arab Emirates, and the USA, representing public and private archives, state and university libraries, research institutions, IT service providers and web archiving consultants.

For an intense three days, keynote speeches and short and long papers alternated with panel discussions, with speakers and presenters reporting on their approaches to and practical experiences in archiving websites and in using archived web material for research.
Whereas the individual case studies came from very different backgrounds – focusing on YouTube or social media, exploring possible new tools and methodologies for web archiving and web archives analysis, dealing with the use of Big Data or small datasets in research disciplines from anthropology and linguistics to international relations and migration studies, looking at academic websites, popular culture, internet governance, citizen involvement and even troll communities – it soon became clear that the individual results would lead to common conclusions:

Archived web materials are ‘different’ from both traditional paper-based resources and from the live web. Therefore, existing research theories and traditional approaches to collecting and curating are often not useful when dealing with web materials; new methodologies need to be developed, new questions to be asked. On a practical basis, there is a big need for new tools to deal with the sheer amount of data available for research, for example to filter and analyze web archive collections, and to visualize results.
Archiving web materials, curating collections, and using them as scholarly sources requires a great amount of resources – staff/time, knowledge and expertise, technical infrastructure and tools. To use the existing resources as efficiently as possible, archivists, curators, researchers from different disciplines, IT experts and service providers need to collaborate. Pooling resources across institutions and creating (international) networks to share knowledge and experience seems to be the way forward.

Anna Perricci, Columbia University, on the importance of building web archiving collaborations

Communication and openness are key! Archivists and curators should make web archiving processes transparent and explain to scholars what type of material and information they can realistically expect to find in web archives (and what is likely not to be included!). Researchers should clearly express their needs and expectations, but at the same time, be willing to engage with a new type of resource, requiring new approaches, and at least basic IT skills. IT experts should develop easy-to-use and transparent tools, and share technical knowledge that helps to interpret archived web materials. Users should feed their experience back to curators and developers to help improve web archives selection, metadata/description and discovery tools.

Web archiving is still a young discipline – and research based on archived web material is an even younger one. There are no golden ‘how to’ rules, standards or ‘ultimate authorities’ yet; everyone is still learning. Individual projects encountering problems, or even ‘failing’ to achieve the desired outcome, can still provide valuable lessons for others to learn from. Successes, e.g. in developing and using methodologies and tools for web archiving and using web archives, can be the starting point for developing best practice guidelines in the medium to long term. Again, this requires communication and collaboration within and across institutions, professions, disciplines and countries.

A case study of using Web archives as scholarly sources: Gareth Millward sharing his experience exploring the evolution of disability organisation websites through the UK Domain Data Archive.

The conference’s big strength, apart from giving web archiving professionals and web archive users the opportunity to present their recent and ongoing projects and – in many cases – to ask the other conference participants for input and advice, was certainly to bring together people concerned with web archives from a great variety of backgrounds, thus enabling exchange of ideas, debate and networking. There were many eye-opening moments in terms of discovering that someone else, in a different institution in a different country, has been working on similar topics or encountered similar problems.

Knowing how, and with what results, archived web materials were used in other institutions will be very valuable if and when the Bodleian Libraries decide to promote their own web archive collections. At the same time, getting in touch with web archiving colleagues in the UK and internationally offers much potential for collaborations in future projects.
For example, the Tomsk State University in Russia is currently trying to establish a web archive similar to the Bodleian Libraries’ Web Archives, whilst research projects run at the Institute of Historical Research of the University of London as part of the Big UK Domain Data for the Arts and Humanities Project in cooperation with the British Library could be used as examples to promote the scholarly use of the UK (Legal Deposit) Web Archive in Oxford.

Special Collections in the Danish Web Archive (Netarkivet), which is run by the State and University Library in Aarhus and The Royal Library in Copenhagen. Since 2005 the collection and preservation of the .dk internet has been included in the Danish Legal Deposit Law.

At the end of the conference, everyone was buzzing with enthusiasm and new ideas, and agreed that the event was a great success – not least thanks to the flawless organisation and wonderful Danish hospitality, which included a reception celebrating the anniversary of the Danish Web Archive netarkivet.dk, lots of Smørrebrød (delicious Danish open sandwiches) and a memorable conference dinner, all adding to the friendly and sociable character of the event.

A similar conference is now envisaged to be held in 2016 or 2017 in London, an opportunity not to be missed to catch up with the latest in Web Archiving and strengthen old and new – forgive the pun – links!

Balisage 2010 The Markup Conference

Balisage 2010: The Markup Conference was preceded by the International Symposium on XML for the Long Haul: Issues in the Long-term Preservation of XML, which opened with:

A brief history of markup of social science data: from punched cards to “the life cycle” approach covering the “25-year process of historical evolution leading to DDI, the Data Documentation Initiative, which unites several levels of metadata in one emerging standard.”

Sustainability of linguistic resources revisited looked at some of the difficulties facing language resources over the long-term.

Report from the field: PubMed Central, an XML-based archive of life science journal articles provided insight into the processes deployed to give public access to the full text of more than two million articles.

Portico: A case study in the use of XML for the long-term preservation of digital artifacts discussed some practices that can help assure the semantic stability of digital assets.

The Sustainability of the Scholarly Edition in a Digital World explored the need for “tools to make XML encoding easier, to encourage collaboration, to exploit social media, and to separate transcriptions of texts from the editorial scholarship applied to them”.

A formal approach to XML semantics: implications for archive standards examined whether “The application of Montague semantics to markup languages may make it possible to distinguish vocabularies that can last from those which will not last”.

Metadata for long term preservation of product data discussed the “valuable lessons to be learned from the library metadata and packaging standards and how they relate to product metadata”.

The day concluded with Beyond eighteen wheels: Considerations in archiving documents represented using the Extensible Markup Language (XML) which contemplated “strategies for extending the useful life of archived documents”.

Sessions in the main 2010 conference covered topics such as:

gXML, a new approach to cultivating XML trees in Java which proposed “A single unified Java-based API, gXML, can provide a programming platform for all tree models for which a “bridge” has been developed. gXML exploits the Handle/Body design pattern and supports the XQuery Data Model (XDM)”.

Java integration of XQuery — an information unit oriented approach explored “a novel pattern of cooperation between XQuery and Java developer? A new API, XQJPLUS, makes it possible to let XQuery build “information units” collected into “information trays”.

XML pipeline processing in the browser discussed how providing XProc as a JavaScript-based implementation would offer comprehensive client-side portability for XML pipelines specified in XProc.

Where XForms meets the glass: Bridging between data and interaction design explored using XForms, which offers a model-view framework for XML, within the conventions of existing Ajax frameworks such as Dojo, as a way to bridge differing development approaches: data-centric versus starting from the user interface.

A packaging system for EXPath demonstrated how to adapt conventional ideas of packaging to work well in the EXPath environment. “EXPath provides a framework for collaborative community-based development of extensions to XPath and XPath-based technologies (including XSLT and XQuery)”.

A streaming XSLT processor, in which Michael Kay (editor of the XSLT 2.1 specification) showed how he has been implementing streaming features in his Saxon XSLT processor.
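
As a rough illustration of the general idea of streaming (not of XSLT streaming or of Saxon itself), here is a small Python sketch using the standard library's `xml.etree.ElementTree.iterparse`. The function name `count_elements_streaming` and the sample document are made up for this example. It handles each element as the parser finishes reading it, instead of first loading the whole document as a tree.

```python
import io
import xml.etree.ElementTree as ET


def count_elements_streaming(source, tag):
    """Count elements named `tag` while reading the document incrementally."""
    count = 0
    # "end" events fire once an element has been read in full.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
        # Drop the element's text, attributes and children once handled,
        # which keeps memory use low for large inputs.
        elem.clear()
    return count


if __name__ == "__main__":
    # Made-up sample document standing in for a much larger file.
    sample = io.BytesIO(b"<articles><article/><article/><note/></articles>")
    print(count_elements_streaming(sample, "article"))
```

The same function also accepts a filename in place of the in-memory sample used here.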

Processing arbitrarily large XML using a persistent DOM covered moving the DOM out of memory and into persistent storage, offering another processing option for large documents by utilising an efficient binary representation of the XML document, with a supporting Java API.

Scripting documents with XQuery: virtual documents in TNTBase presented a virtual-document facility integrated into TNTBase, an XML database with support for versioning. The virtual documents can be edited, and changes to elements in the underlying XML repository are propagated automatically back to the database.

XQuery design patterns illustrated the benefits that might extend from the application of meta design patterns to XQuery.

-Renhart Gittens