Media players and the reader interface…

This is quite a long post, so I’m going to put the final line at the top too in case you don’t read that far… 😉

Your thoughts on media players would be most welcome!

The trouble, um, I mean, beauty of digital collections is that they redefine what a “manuscript” is. This is nothing new. Once upon a time someone somewhere probably upset the apple cart when they arrived at the hallowed doors with a basket full of photographs. Now we have video, audio and images, all of which can be encoded in any number of “standard” ways. (Not to mention a zillion different binary formats for just about any purpose you can imagine from sheet music to the latest car designs, which may well require more than just document-like presentation too – 3D models for example). These new manuscripts bring challenges for preservation, of course, but they also present challenges for presentation.

To address this, I’ve been learning more about media players in browsers, with a view to picking one for the reader interface. I’m no expert in this field, so here is my layman’s consideration of what I’ve found out; if you want to read more, this is great!

The traditional method of rendering audio/video in browsers, which pre-dates their ability to handle video themselves, is to use a browser plug-in, either directly (for example the VLC plug-in) or (more commonly) built on top of Flash (e.g. Flowplayer) or Java (e.g. Cortado). The exact mark-up required to use these players varies. Some simply use the “embed” tag, while others have JavaScript libraries to simplify their usage and allow for graceful degradation in the event that the browser does not have the correct plug-in and/or cannot understand/run JavaScript. (This may be an issue when we deploy the interface into a reading room with machines whose configuration we do not control.)

But the times, they are a-changin’. Just as old browsers knew what to do when presented with an “img” tag, most modern browsers are beginning to support HTML5’s “video” and “audio” tags, allowing the browser itself to handle the playback rather than farming this out to a plug-in. (For more on HTML5 generally see this presentation – the video tag is mentioned at about 58 minutes in.) An added bonus of bringing video into the browser in this way is that it has inspired folks to build media players that manipulate the Web page to add the correct mark-up – be it a video tag, an embed, or whatever – to play the media. This is currently being used to generate some nice media players that’ll use the browser, the Flash plug-in, or whatever is available (see OpenVideoPlayer and OSMPlayer).
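As a sketch of what this graceful degradation can look like in mark-up (the file names, player path and dimensions here are all hypothetical), a page can offer the video tag first and let older browsers fall through to a plug-in embed placed inside it:

```html
<!-- Browsers that understand the video tag use it and ignore the
     fallback content; older browsers ignore the video tag and
     render the Flash embed instead. -->
<video src="clip.ogv" controls width="640" height="360">
  <embed src="player.swf" type="application/x-shockwave-flash"
         width="640" height="360" flashvars="file=clip.flv">
</video>
```

This nesting trick is essentially what the JavaScript-based players automate, picking whichever option the visiting browser can actually handle.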

So now we get to the crux of it. What should we do for the reader interface? Go old-school (and annoy Steve Jobs) and use a Flash-based player? Adopt the new ways of HTML5? Insist on an Open Source player? Buy something in?

To work out the answer I did a bit of investigating and have installed most of the players mentioned thus far in this post – Flowplayer, OSMPlayer, video-tag only, VLC and Cortado, as well as JWPlayer.

Flowplayer uses the Flash-plugin to play Flash video (and, with an additional plug-in, MP3 audio) – it does not support Ogg. It is very simple to use and very slick to look at. It is open source, released under GPL3 with an additional (and reasonable) “attribution clause” which basically means the Flowplayer logo must appear on the player unless you pay extra.

JWPlayer works much like Flowplayer (though there is also a beta HTML5 video player in the making) and seems pretty good. While the source code is available, it is not clear whether this is an open source product – the source files do not include a LICENSE.txt or any licence boilerplate. Probably I’m just missing something there, though, and JWPlayer seems a good choice if you don’t mind Flash.

OSMPlayer is also open source and has numerous options for installation including a Drupal module (untested), a PHP library and a “stand-alone” configuration. In theory it supports lots of different audio and video formats and uses several divs to create a nice browser based player. Unfortunately, following the guidelines for both PHP and stand-alone configurations, I could not get it to work on my test server.

The video-tag-only approach works pretty well with Firefox 3.6 on Ubuntu 10.04 and is very easy to include in a Web page. Unfortunately it isn’t nearly as slick at playback as Flowplayer – there is a delay in starting the video during which it is unclear what is going on.

The VLC plug-in is also open source, seems to work pretty well, and should be able to handle many different formats, but it isn’t nearly as refined as the other players, and the provided example code fails to stop the video or make it full-screen. The VLC desktop player is wonderful, but I’m not convinced by the Firefox plug-in.

Cortado is a Java applet provided to play Ogg Theora among other things. Usage is very simple – you just add an applet tag to the page – but playback was jerky, slow and lacked sound. I do not know if my machine is to blame for this or the player itself, so I will have to investigate further.

Were I sat down and forced to make a choice, I think I’d struggle. Flowplayer is slick to use and easy to implement, but requires that we convert everything to Flash video or MP3 (mind you, most media will arrive in suitable formats, I imagine). JWPlayer is very similar in this regard. I’d like to adopt the video tag as this supports a wide range of formats, including open ones, but currently the experience is not very smooth, and refinements in this area provided by things like OSMPlayer are still in the early stages of development – JWPlayer’s HTML5 offering is still beta, for example.

I guess my feeling for now is to either go with Flowplayer (and swallow the conversions required – actually pretty easy with ffmpeg) or spend a bit of time with OpenVideoPlayer’s HTML5 work and the video tag. At this stage I think we probably need both working in the interface and see where the better user experience is…
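On the conversion point, the commands themselves are simple enough to generate. Here is a minimal sketch (the file names, target formats and bitrate are my own hypothetical choices, not a tested recipe) of building ffmpeg command lines for a Flowplayer-style FLV/MP3 setup:

```python
from pathlib import Path

# Map each incoming extension to a presentation format plus any
# extra ffmpeg arguments (hypothetical choices for illustration).
TARGETS = {
    ".wav": (".mp3", ["-ab", "128k"]),  # audio -> MP3 for the Flash player
    ".avi": (".flv", ["-f", "flv"]),    # video -> Flash video
}

def ffmpeg_command(source):
    """Build (but do not run) the ffmpeg command line for one file."""
    src = Path(source)
    target_ext, extra_args = TARGETS[src.suffix]
    target = src.with_suffix(target_ext)
    return ["ffmpeg", "-i", str(src)] + extra_args + [str(target)]

print(ffmpeg_command("interview.wav"))
# → ['ffmpeg', '-i', 'interview.wav', '-ab', '128k', 'interview.mp3']
```

A batch job over an accession directory would just loop this over the files it finds and hand each list to the shell, which is roughly what I mean by the conversions being “actually pretty easy with ffmpeg”.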

I should throw one more thing into the pot – the problem of formats. Video and audio files are complicated beasts consisting of containers and tracks and such – a bit like cassettes! The contents of these containers are encoded in a variety of ways, each requiring different software to decode and render. We have the same problem with documents, and we solve it by converting all the text-based materials we receive into PDFs (for presentation, I should say, before anyone starts worrying about the preservation implications of PDF!) and using a PDF plug-in to display them.

Can we do the same with our audio/video material, and if so, what format (I’m using “format” as a general term to mean “container/encoding”!) do we use? (Victoria has already done some work along these lines, creating WAVs for storage and MP3s for presentation from audio CDs.) Are there any additional concerns given that most born-digital video/audio is likely to arrive at our doors in a compressed format? Should we uncompress it? Is such a thing even possible? Should we (and do we have the processing power to) convert all audio/video materials to open formats for both preservation and presentation purposes?

We’re going to raise this final question at our next Library developer meeting and see what folks think. In theory we can delay the decision because most browsers and their plug-ins handle multiple formats, but perhaps we should have a standard delivery format much like we currently have PDF?

Oh dear. I started writing this post with the hope of finding all the answers! I have found out a lot about media players at least, which can only be a good thing, and I’ve also found out that the state of the art is not quite as far along as the proponents of HTML5 killing Flash would like us to believe – though there is good work going on here and this is the future. I’m also unclear just how much my experience of these things is hindered by using Ubuntu – I often wrestle with the playback of media files under Linux! 🙂

Still, I think we’re further along, nearer an answer and at least in a place to know where to start testing…

Your thoughts on media players would be most welcome! 🙂

-Peter Cliff

Odds and ends from day one of the digital lives conference

The digital lives conference provided a space to digest some of the findings of the AHRC-funded digital lives project, and also to bring together other perspectives on the topic of personal digital archives. At the proposal stage, the conference was scheduled to last just a day; in the event one day came to be three, which demonstrates how much there is to say on the subject.

Day one was titled ‘Digital Lifelines: Practicalities, Professionalities and Potentialities’. This day was intended mostly for institutions that might archive digital lives for research purposes. Cathy Marshall of Microsoft Research gave the opening talk, which explored some personal digital archiving myths on the basis of her experiences interviewing real-life users about their management of personal digital information.

Next came a series of four short talks on ‘aspects of digital curation’.

  • Cal Lee, of UNC Chapel Hill, emphasised the need for combining professional skills in order to undertake digital curation successfully. Archives and libraries need to have the right combination of skills to be trusted to do this work.
  • Naomi Nelson of MARBL, Emory University, told a tale of two donors: the first is the entity that gives/sells an archive to a library, and the second is the academic researcher. Libraries need to have a dialogue with donors of the first type about what a digital archive might contain; this goes beyond the ‘files’ that they readily conceive as components of the archive, and includes several kinds of ‘hidden’ data that may be unknown to them. The second donor, ‘the researcher’, becomes a donor by virtue of the information that the research library can collect about their use of an archive. Naomi raised interesting questions about how we might be able to collect this kind of data and make it available to other researchers, perhaps at a time of the original researcher’s choosing.
  • Michael Olson of Stanford University Libraries spoke of their digital collections and programmes of work. Some mention of work on the fundamentals – the digital library architecture (equivalent to our developing Digital Asset Management System – DAMS – which will provide us with resilient storage, object management and tools and services that can be shared with other library applications). Their digital collections include a software collection of some 5000 titles, containing games and other software. I think that sparked some interest from many in the audience!
  • Ludmilla Pollock, Cold Spring Harbour Laboratory, told us about an extensive oral history programme giving rise to much digital data requiring preservation. The collection contains videos of the scientists talking about their memories and has a dedicated interface.

Afterwards, we heard from a panel of dealers in archival materials: Gabriel Heaton of Sotheby’s, Julian Rota of Bertram Rota and Joan Winterkorn of Bernard Quaritch. I was curious to hear if the dealers had needed to appraise archives containing obsolete digital media. Digital material is still only a tiny proportion of collections being appraised by dealers, and it seems that what little digital material they do encounter may not be appraised as such (disk labels are viewed rather than their contents). While paper archives are plentiful, perhaps there’s not much incentive to develop what’s needed to cater for the digital (many archivists may well feel this way too!). What’s certain is that the dealer has to be quite sure that any investment in facilitating the appraisal of digital materials pays dividends come sale time.

Inevitably, questions of value were a feature of the session. The dealers suggest that archives and libraries are not willing to pay for born-digital archives yet; perhaps this stems from concerns about uniqueness and authenticity, and the lack of facilities to preserve, curate and provide access. It’s not like there’s actually much on the market at the moment, so perhaps it’s a matter of supply as much as demand? Comparisons with ‘traditional’ materials were also made using Larkin’s magic/meaningful values:

“All literary manuscripts have two kinds of value: what might be called the magical value and the meaningful value. The magical value is the older and more universal: this is the paper [the writer] wrote on, these are the words as he wrote them, emerging for the first time in this particular magical combination. We may feel inclined to be patronising about this Shelley-plain, Thomas-coloured factor, but it is a potent element in all collecting, and I doubt if any librarian can be a successful manuscript collector unless he responds to it to some extent. The meaningful value is of much more recent origin, and is the degree to which a manuscript helps to enlarge our knowledge and understanding of a writer’s life and work. A manuscript can show the cancellations, the substitutions, the shifting towards the ultimate form and the final meaning. A notebook, simply by being a fixed sequence of pages, can supply evidence of chronology. Unpublished work, unfinished work, even notes towards unwritten work all contribute to our knowledge of a writer’s intentions; his letters and diaries add to what we know of his life and the circumstances in which he wrote.”

Philip Larkin ‘A Neglected Responsibility: Contemporary Literary Manuscripts’, Encounter, July 1979, pp. 33-41.

The ‘meaningful’ aspects of digital archives are apparent enough, but what of the ‘magical’? Most, if not all, contributors to the discussion saw ‘artifactual’ value in digital media that had an obvious personal connection, whether Barack Obama’s Blackberry or J.K. Rowling’s laptop. What wasn’t discussed so much was the potential magical value of seeing a digital manuscript being rendered in its original environment. I find that quite magical, myself. I think more people will come to see it this way in time.

Delegates were then able to visit the digital scriptorium and audiovisual studio at the British Library.

After lunch, we resumed with a view of the ‘Digital Economy and Philosophy’ from Annamaria Carusi of the Oxford e-Research Centre. She offered some interesting thoughts about trust and technology, referring back to Plato’s Phaedrus and the misgivings that an oral culture had about writing. New technologies can be disruptive and it takes time for them to be generally accepted and trusted.

Next, four talks under the theme of digital preservation.

  • First, an overview of the history of personal films from Luke McKernan, a curator at the British Library. This included changes in use and physical format, up to the current rise of online video populating YouTube and its even more prolific Chinese equivalents. Luke also talked about ‘lifecasting’, pointing to JenniCam (now a thing of the past, apparently), and also to folk who go so far as to install movement sensors and video cameras throughout their homes. Yikes!
  • We also heard from the British Library’s digital preservation team about their work on risk assessment for the Library’s digital collections (if memory serves, about 3% of the CDs they sampled in a recent survey had problems). Their current focus is getting material off vulnerable media and into the Library’s preservation system; this is also a key aim in our first phase of futureArch. There was also mention of the Planets and LIFE projects. Between project and permanent posts, the BL have some 14 people working on digital preservation. If you count those working in web archiving, audiovisual collections, digitisation, born-digital manuscripts, digital legal deposit and other such areas, who also have knowledge of this field, it’s probably rather more.
  • William Prentice offered an enjoyable presentation on audio archiving, which had some similar features to Luke’s talk on film. It always strikes me that audiovisual archiving is very similar to digital archiving in many respects, especially when there’s a need to do digital archaeology that involves older hardware and software that itself requires management.
  • Juan-José Boté of the University of Barcelona spoke to us about a number of projects he had been working on. These were very definitely hybrid archives and interesting for that reason.

Next, I chaired a panel of ‘Practical Experiences’. Being naturally oriented toward the practical, I found there was lots for me here.

  • John Blythe, University of North Carolina, spoke about the Southern Historical Collection at the Wilson Library, including the processes they are using for digital collections. Interestingly, they have use of a digital accessioning tool created by their neighbours at Duke University.
  • Erika Farr, Emory University, talked about the digital element of Salman Rushdie’s papers. Interesting to note that there was overlap of data between PCs, where the creator has migrated material from one device to another; this is something we’ve found in digital materials we’ve processed too. I also found Rushdie’s filenaming and foldering conventions curious. When working with personal archives, you come to know the ways people have of doing things. This applies equally to the digital domain – you come to learn the creator’s style of working with the technology.
  • Gabby Redwine of the Harry Ransom Center, University of Texas at Austin gave a good talk about the HRC’s experiences so far. HRC have made some of their collections accessible in the reading room and in exhibition spaces, and are doing some creative things to learn what they can from the process. Like us, they are opting for the locked down laptop approach as an interim means of researcher access to born-digital material.
  • William Snow of Stanford University Libraries spoke to us about SALT, or the Self Archiving Legacy Toolkit. This does some very cool things using semantic technologies, though we would need to look at technologies that can be implemented locally (much of SALT’s functionality is currently achieved using third-party web services). Stanford are looking to harness creators’ knowledge of their own lives, relationships, and stuff, to add value to their personal archives using SALT. I think we might use it slightly differently, with curators (perhaps mediating creator use, or just processing?) and researchers being the most likely users. I really like the richness of the faceted browser (they are currently using Flamenco) – some possibilities for interfaces here. Their use of Freebase for authority control was also interesting; at the Bod, we use The National Register of Archives (NRA) for this and would be reluctant to change all our legacy finding aids and place our trust in such a new service! If the NRA could add some Freebase-like functionality, that would be nice. Some other clever stuff too, like term extraction and relationship graphs.

The day concluded with a little discussion, mainly about where digital forensics and legal discovery tools fit into digital archiving. My feeling is that they are useful for capture and exploration. Less so for the work needed around long-term preservation and access.

-Susan Thomas