A series of books published around the turn of the 20th century is crucial to modern bibliographic research: these are biographical dictionaries of booksellers and printers, including addresses, dates and significant works printed. Some of these books are out of copyright and available as scanned pages, allowing us not only to copy them into new formats, but also to adapt them into new kinds of resource.
These scanned books could be made more useful to researchers in a number of ways. Text could be meaningfully segmented, by dictionary entry rather than by page or paragraph. The book’s internal and external citations could become links, for instance linking a proper name to identifiers for the named person. The book could even have an open data representation that other data sets can link to, for example to state that a person is described in the book.
This case study describes the transformation of one of these books, Henry Plomer’s A Dictionary of the Booksellers and Printers who Were at Work in England, Scotland and Ireland from 1641 to 1667, using Wikisource, part of the Wikimedia family of sites. As a collaborative platform, Wikisource allowed Bodleian staff to work with Wikisource volunteers. We benefited from many kinds of volunteer labour, from correcting simple errors in the text to creating custom wiki-code to speed up the process.
Many important data sets currently exist only in the form of printed books, including catalogues, dictionaries and encyclopedias. We adopted a process that has already been used on some large, multi-volume works and could be used for many more.