Semantic data and the stories we’re not telling

One of my earliest memories of television was James Burke’s series Connections. It was fascinating yet accessible: each episode explored technology, history, science and society, jumping across topics based on historical connections or charming coincidences. One episode started with the stone fireplace and ended with Concorde.

In a digital utopia, we would each be our own James Burke, creating and sharing intellectual journeys by following the connections that interest us. We are not there yet. Many very valuable databases exist online, but the connections between them are obscured rather than celebrated, and this is an obstacle for anyone using those data in education or research. In a previous post I described the problems that come from the fact that things have different names in different databases, and described a semantic web approach to link them together.

Building on this approach, web applications can help people create their own stories, choosing their own path through sources of reliable information and building unexpected connections. In this post I describe three design principles behind these applications. Let’s start with a story.


Research Uncovered—The artist sleeps and the audience performs



What: The artist sleeps and the audience performs

Who: Menaka PP Bora, David de Min, and Sebastiano Ludovico

When: 13:00—14:00, Monday 27 November 2017

Where: Weston Library Lecture Theatre (map)

Access: open to all

Admission: free

Registration is required

Blending technology and performance art for new experiences in viewing Bodleian collections

This performance talk highlights a new way for people to experience and interpret visual arts collections through performance and Velapp, a mobile app billed as the ‘world’s most natural video editor’. The talk uses Velapp to explore the challenges and opportunities that new technology poses for artistic responses to heritage collections.

During the talk the audience is invited to play with a sample Velapp mobile phone app, learning to shoot film and simultaneously edit while enjoying the performance of items from the Bodleian’s collections. This technological intervention enables members of the audience to produce mobile films while they watch the performance, editing as they continue to film. The experience becomes more entertaining and immersive.

Dr. Menaka PP Bora is a multi-award-winning performing artist, choreographer, ethnomusicologist, actor, and broadcaster. Besides touring her sell-out solo shows in the ‘world dance’ scene and regularly appearing as a guest speaker on BBC Radio, she is the Bodleian’s Affiliated Artist and won a highly prestigious Leverhulme Early Career Fellowship in 2016.

David de Min is a tech entrepreneur and the Founder and CEO of Velapp. David is currently working on a high-profile project for the UK technology industry that aims to have a major impact on the tech sector and the economy, place the UK firmly on the map as a game changer in the tech world, and drive positive social change across Europe.

Sebastiano Ludovico is a talented young artist and tech investor who belongs to the Sicilian royal family in Italy. Based in London, Sebastiano has exhibited his paintings at solo exhibitions since the age of 5. His works of art are particularly appreciated by Hollywood stars and international pop music artists, and all funds raised from sales of his work are donated directly to children’s foundations and other charities, in particular the Samuel L. Jackson Foundation, with which he has collaborated for the last 4 years.

This performance talk is hosted by the Centre for Digital Scholarship as part of the Research Uncovered series of public talks.

Bodleian Student Editions 2017–2018

Please note that these workshops are now fully subscribed for this academic year, 2017–2018. To express an interest in future workshops, please email Pip Willcox.

Textual editing workshops for undergraduates and postgraduates

Elizabeth Wagstaff letter, 2 May 1621

A collaboration between the Bodleian’s Department of Special Collections and Centre for Digital Scholarship, and Cultures of Knowledge, a project based at the Faculty of History

We are looking for enthusiastic undergraduates and postgraduates from any discipline to take part in workshops in textual editing culminating in the publication of a citable transcription.

Join the waiting list: see below for details

After a hugely successful pilot run—from which published transcriptions can be seen here—these workshops are in their second year, and are scheduled to take place on the following dates:

Michaelmas Term 2017

  • 10:00–16:30 Thursday 7th week, 23 November

Hilary Term 2018

  • 10:00–16:30 Wednesday 3rd week, 31 January
  • 10:00–16:30 Thursday 7th week, 1 March

Trinity Term 2018

  • 10:00–16:30 Wednesday 3rd week, 9 May
  • 10:00–16:30 Thursday 7th week, 7 June

Textual editing is the process by which a manuscript reaches its audience in print or digital form. The texts we read in printed books are dependent on the choices of editors across the years, some obscured more than others. The past few years have seen a surge of interest in curated media, and the advent of new means of distribution has inspired increasingly charged debates about what is chosen to be edited, by whom and for whom.

These workshops give students the opportunity to examine these questions of research practice in a space designed around the sources at the heart of them. The Bodleian Libraries’ vast collections give students direct access to important ideas free from years of mediation, and to authorial processes in their entirety, while new digital tools allow greater space to showcase the lives of ordinary people who may not feature in traditional narrative history.

Our focus is on letters of the early modern period: a unique, obsolescent medium, by which the ideas which shaped our civilisation were communicated and developed. Participants will study previously unpublished manuscripts from Bodleian collections, working with Bodleian curators and staff of Cultures of Knowledge (http://www.culturesofknowledge.org), to produce a digital transcription, which will be published on the flagship resource site of Cultures of Knowledge, Early Modern Letters Online (http://emlo.bodleian.ox.ac.uk), as ‘Bodleian Student Editions’.

The sessions are standalone, but participants in last year’s workshops have gone on to further transcription work with Bodleian collections and with research projects around the country, as well as producing the first scholarship on some of the manuscripts by incorporating material in their own research (from undergraduate to doctorate level). The first-hand experience with primary sources, and the citable transcription, are extremely useful for those wishing to apply for postgraduate study in areas where this is valued: one participant last year successfully proceeded from a BA in Biological Sciences to an MA in Early Modern Literature on the basis of having attended.

The sessions provide a hands-on introduction to the following:

  1. Special Collections handling
  2. Palaeography and transcription
  3. Metadata curation, analysis, and input into Early Modern Letters Online
  4. Research and publication ethics
  5. Digital tools for scholarship and further training available

To hear about future textual editing workshops and other events as they are advertised, please join the digital scholarship mailing list.

Participation is open to students registered for any course at the University of Oxford. If you would like to participate or to join the waiting list, please contact Carmen Bohne, Special Collections Administrator, carmen.bohne@bodleian.ox.ac.uk, and include:

  1. your ox.ac.uk email address
  2. your department
  3. your level and year of study
  4. particular access requirements
  5. particular dietary requirements

Please note that registration is only open for Michaelmas term’s workshop. You may register your interest in subsequent workshops: please state the dates on which you are available. Places are limited and will be confirmed for each term’s workshops at the start of that term.

The Bodleian Libraries welcome thoughts and queries from students of all levels on ways in which the use of archival material can facilitate your research. For an idea of the range of collections in the Weston, visit the exhibition Bodleian Treasures: 24 Pairs in the Treasury gallery in Blackwell Hall (http://treasures.bodleian.ox.ac.uk), where some famous items are illuminated through juxtaposition with lesser-known items that prompt reflection on the concept of a treasure. Our next themed exhibition, Designing English, showcasing the graphic design of mediaeval manuscripts in English from Bodleian collections, will open in the ST Lee Gallery on 1 December. For the first two months it will be shown alongside Redesigning the medieval book, a display of contemporary book arts inspired by the exhibition and created as part of a workshop and competition run in collaboration with the English Faculty.

Turning a historical book into a data set

A series of books published around the turn of the 20th century is crucial to modern bibliographic research: they are biographical dictionaries of booksellers and printers, including addresses, dates and significant works printed. Some of these books are out of copyright and available as scanned pages, allowing us not only to copy them into new formats, but also to adapt them into new kinds of resource.

These scanned books could be made more useful to researchers in a number of ways. Text could be meaningfully segmented, by dictionary entry rather than by page or paragraph. The book’s internal and external citations can become links, for instance linking a proper name to identifiers for the named person. The book can even have an open data representation which other data sets can hook on to, for example to say that a person is described in the book.
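To make the idea of segmenting by dictionary entry concrete, here is a minimal sketch in Python. It is not the process used on Wikisource. It assumes, purely for illustration, that each entry opens with an upper-case surname followed by a forename in parentheses; the function name `split_entries` and the sample text are invented for this example.

```python
import re

# Assumed convention (not taken from the book itself): an entry starts at the
# beginning of a line with an upper-case surname and a forename in
# parentheses, for example "EXAMPLE (ANNE), ...".
HEADWORD = re.compile(r"^([A-Z][A-Z'\-]+) \(([^)]+)\)", re.MULTILINE)


def split_entries(text):
    """Split running text into one record per dictionary entry."""
    matches = list(HEADWORD.finditer(text))
    entries = []
    for i, match in enumerate(matches):
        # An entry runs from its headword to the next headword (or the end).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        # Collapse line breaks so an entry no longer depends on page layout.
        body = " ".join(text[match.start():end].split())
        entries.append(
            {"surname": match.group(1), "forename": match.group(2), "text": body}
        )
    return entries


# Invented sample text, standing in for proofread page scans.
SAMPLE = """EXAMPLE (ANNE), bookseller in London, 1650-60.
Continued on a second line that crosses a page break.
SAMPLE (JOHN), printer in Oxford, 1641."""

for entry in split_entries(SAMPLE):
    print(entry["surname"], "|", entry["forename"], "|", entry["text"])
```

Once the text is held as one record per entry, each record can carry its own links and identifiers, which is what lets other data sets refer to a specific person rather than to a page number.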

This case study describes the transformation of one of these books, Henry Plomer’s A Dictionary of the Booksellers and Printers who Were at Work in England, Scotland and Ireland from 1641 to 1667, using Wikisource, part of the Wikimedia family of sites. As a collaborative platform, Wikisource allowed Bodleian staff to work with Wikisource volunteers. We benefited from many kinds of volunteer labour, from correcting simple errors in the text to creating custom wiki-code to speed up the process.

A lot of important data sets currently exist only in the form of printed books, including catalogues, dictionaries and encyclopedias. We adopted a process that has already been used on some large, multi-volume works and could be used for many more.

Digital Approaches to the History of Science: a successful workshop

‘Digital Approaches to the History of Science’, the first of two planned workshops on this topic, was held at the History Faculty in Oxford on 28 September 2018. A total of nearly sixty attendees assembled to hear presentations from a selection of the most exciting current projects in this field from around the UK.

Professor Rob Iliffe, representing the Newton Project, addressed the ongoing challenges and complexity of digitizing and presenting the manuscript writings of Isaac Newton, and Alison Pearn spoke of the related issues faced by the digital side of the ongoing Darwin Correspondence Project. Lauren Kassell, of the Casebooks Project, introduced a very different type of material and spoke of the need to find new ways of representing, encoding and searching the mass of information contained in early modern medical-astrological casebooks.

After lunch two speakers discussed from complementary perspectives the opportunities represented by the very rich archive of The Royal Society. Louisiane Ferlier discussed the digitization of Royal Society journals and the work needed to clean and link the metadata about the articles in them. Pierpaolo Dondio described his work modelling and visualising the network of authors, editors and referees who controlled the content of those papers, and provided examples of the kinds of research outcomes such work can produce. A final talk turned to the use of digital humanities resources in the university classroom: Kathryn Eccles and Howard Hotson described the Cabinet Project, which has made a rich ecology of digital images and objects available to students on a growing list of Oxford undergraduate papers.

Rich discussions took place both around the individual presentations and over lunch and coffee, and this sell-out event has certainly stimulated interest and ongoing discussion about the distinctive opportunities for history of science created by digital scholarship and resources.

Reflections on discussion topics during the workshop by Pip Willcox

The event was supported by the Centre for Digital Scholarship (Bodleian Libraries), ‘Reading Euclid’, The Royal Society and the Newton Project, and was organized jointly by the Centre for Digital Scholarship and ‘Reading Euclid’. The date for the second workshop will be announced shortly.

—Benjamin Wardhaugh, ‘Reading Euclid’

Top image credit: René Descartes, Principia philosophiae (Amsterdam, 1644), ‘Cartesian network of vortices of celestial motion’, p. 110. Bodleian Library Savile T 22. Edited in Photoshop by Yelda Nasifoglu.

Working with Spreadsheets: a workshop

Image of hand-drawn spreadsheet

What: Working with Spreadsheets: a workshop

When: 10:00—16:30, Tuesday 21 November

Where: Centre for Digital Scholarship, Weston Library (map)

Access: open to all members of the University

Admission: free

Trainers: Iain Emsley and Pip Willcox

Registration is required: please see below

This workshop is designed for anyone who works with spreadsheets and wants to learn how to explore that data more efficiently and consistently. No prior experience is required. The hands-on workshop teaches basic concepts, skills, and tools for working more effectively and reproducibly with your data.

We will cover data organization in spreadsheets and OpenRefine for managing data.

By the end of the workshop participants will be able to manage and analyze data effectively and be able to apply the tools and approaches directly to their ongoing research.

The workshop draws on lessons prepared by Data Carpentry and adapted by the trainers for use with Early English Books Online Text Creation Partnership data.

The methods that you will learn will be applicable to work in any field that uses spreadsheets. The EEBO-TCP subject matter we will use may be of particular interest to people working with library or early modern data.

Registration

To register, please email Pip Willcox (pip.willcox@bodleian.ox.ac.uk) with:

  • Your name
  • Your ox.ac.uk email address
  • Your departmental affiliation

This workshop is run in collaboration with the Centre for Digital Scholarship and the Reproducible Research Oxford project.

For announcements about future workshops and related activities run by Reproducible Research Oxford, please see the project website, subscribe to the mailing list, and follow the project on Twitter @RR_Oxford.

Equipment

Participants are requested to bring a laptop. To work with spreadsheets, you will need an application such as Microsoft Excel, Mac Numbers, or OpenOffice.org. If you don’t have a suitable program installed, you might like to use LibreOffice, a free, open source spreadsheet program.

You will also need OpenRefine (formerly Google Refine) and a web browser, and to have Java installed.

If you cannot bring a laptop with you, please let us know before the day.

Trainers

Iain Emsley works for the University of Oxford e-Research Centre on digital library and museums projects. Having recently finished an MSc in Software Engineering, he has started a PhD in Digital Media at Sussex University.

Pip Willcox is the Head of the Centre for Digital Scholarship at the Bodleian Libraries and a Senior Researcher at the University of Oxford e-Research Centre.

Image credit: Stockbyte/Getty Images.

Research Uncovered—Beyond reading: understanding the book through computer vision

What: Research Uncovered—Beyond reading: understanding the book through computer vision

Who: Giles Bergel

When: 13:00—14:00, Thursday 2 November 2017

Where: Weston Library Lecture Theatre (map)

Access: open to all

Admission: free

Registration is required

This talk showcases Oxford’s cutting-edge research at the intersection of book history and computer vision. It aims to make images of books as easy to search, compare and annotate as their texts.

The University’s Visual Geometry Group has a long track record of working with University researchers and collections, building tools to help researchers analyse everything from classical art to fifteenth-century printed books and English broadside ballads, as well as numerous applications in the sciences. Several of these tools have now been openly released for all to use and adapt.

The talk reveals how computer vision, far from detracting from understanding books as material objects, offers a fresh pair of eyes on what remains one of humanity’s most sophisticated inventions and richest forms of heritage.

Dr Giles Bergel is Digital Humanities Research Officer in the University of Oxford’s Visual Geometry Group. He works on printed books, printing materials and the history of the book trade. Find out more information.

Book tickets: http://www.bodleian.ox.ac.uk/whatson/whats-on/upcoming-events/2017/nov/beyond-reading

Reconciling database identifiers with Wikidata

Charles Grey, former Prime Minister, has an entry in Electronic Enlightenment. How do we find his UK National Archives ID, British Museum person ID, History of Parliament ID, National Portrait Gallery ID, and 22 other identifiers? By first linking his Wikidata identifier.

In a previous blog post I stressed the advantage of mapping the identifiers in databases and catalogues to Wikidata. This post describes a few different tools that were used in reconciling more than three thousand identifiers from the Electronic Enlightenment (EE) biographical dictionary.

The advantages to the source database include:

  • Maintaining links between Wikipedia and the source database. EE and Early Modern Letters Online (EMLO) are two biographical projects that maintain links to Wikipedia. As Wikipedia articles get renamed or occasionally deleted, links can break. It is also easy to miss the creation of new Wikipedia articles. As EE and EMLO links are added to Wikidata, a simple database query gets a list of Wikipedia article links and their corresponding identifiers. Thus we can save work by automatically maintaining the links.
  • Identifying the Wikipedia articles of individuals in the source database. These are targets for improvement by adding citations of the source database.
  • Identifying individuals in the source database who lack Wikipedia articles, or who have articles in other language versions of Wikipedia, but not English. New articles can raise the profile of those individuals and can link to the source database. We raised awareness among the Wikipedian community with a project page and blog post. We also arranged with Oxford University Press to give free access to EE for active Wikipedia editors who requested it, via OUP’s existing Wikipedia Library arrangement.
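As a rough illustration of the “simple database query” mentioned in the first point, the sketch below builds a SPARQL query of the kind the Wikidata Query Service accepts and turns a response in the standard SPARQL JSON results format into a lookup table. It is a sketch under stated assumptions, not the query used for EE or EMLO: the property ID `P1234` is a placeholder that must be replaced with the Wikidata property for the source database, the function names are made up for this example, and the sample response is invented so the code runs without a network connection.

```python
# Query pattern: every item that has a value for the given property, together
# with the English Wikipedia article (sitelink) about that item.
SPARQL_TEMPLATE = """
SELECT ?id ?article WHERE {{
  ?item wdt:{prop} ?id .
  ?article schema:about ?item ;
           schema:isPartOf <https://en.wikipedia.org/> .
}}
"""


def build_query(prop):
    """Return the SPARQL text for one identifier property, e.g. 'P1234'."""
    if not (prop.startswith("P") and prop[1:].isdigit()):
        raise ValueError(f"not a Wikidata property ID: {prop!r}")
    return SPARQL_TEMPLATE.format(prop=prop).strip()


def links_by_identifier(response):
    """Map each source-database identifier to its Wikipedia article URL."""
    return {
        row["id"]["value"]: row["article"]["value"]
        for row in response["results"]["bindings"]
    }


# Invented response in the SPARQL JSON results layout (not real data).
SAMPLE_RESPONSE = {
    "results": {
        "bindings": [
            {
                "id": {"type": "literal", "value": "example-0001"},
                "article": {
                    "type": "uri",
                    "value": "https://en.wikipedia.org/wiki/Example_person",
                },
            }
        ]
    }
}

# P1234 is a placeholder, not the real property for EE or EMLO.
print(build_query("P1234"))
print(links_by_identifier(SAMPLE_RESPONSE))
```

Running a query like this on a schedule and comparing the result with the links already stored in the source database is one way to notice renamed, deleted or newly created articles without checking each one by hand.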


Report from Wikimania

Last month I had the privilege to attend the Wikimania conference in Montreal, Canada, where 900 people from around the world gathered for two days of hacking and building and then three days of conference sessions. The conference scope includes not just the Wikimedia projects but also the big themes of open education, open access, community building, and privacy and rights in the digital age. One blog post by one attendee is only going to capture a sliver of what went on, and here I am summarising some big projects of most relevance to university research projects and GLAMs.

This time round, Wikidata rather than Wikipedia was generating the most excitement. Wikidata, the free structured knowledge base, is going through a period of explosive growth, helped in small part by data shared from partner institutions including Oxford University, and the conference brought together many people using Wikidata to document cultural heritage and current knowledge.

The author and hundreds of other Wikimedians. Photo by Victor Grigas of the Wikimedia Foundation, CC-BY-SA 4.0


Data Carpentry Workshop for Humanists

You are invited to join a free Data Carpentry workshop run by the Reproducible Research Oxford project. Registration is required.

Date: 26–27 September 2017 

Venue: Institute of Cognitive and Evolutionary Anthropology, 64 Banbury Road, Oxford OX2 6PN

The workshop will cover data organization in spreadsheets and OpenRefine, data analysis and visualization in Python, and SQL for data management, with a focus on humanities data. This is a joint effort with Data Carpentry to develop a (pilot) curriculum for the digital humanities. It is at an introductory level.

See the workshop website for details: https://rroxford.github.io/2017-09-26-oxford/

The workshop is free and open to any member of the University — researchers, staff, and students. It will be particularly relevant to people working with humanities data, though the methods are widely applicable.