Early Modern Texts: Digital Methods and Methodologies
Thank you to everyone who proposed research for our conference. Given the excellence of the proposals we received, we are sorry the conference isn’t longer. We hope to hear more about the research projects we couldn’t fit into the programme in the future.
You can read the conference programme, and find abstracts of the papers to be presented below.
- Robyn Adams, Centre for Editing Lives and Letters, University College London, and Lizzy Williamson, Queen Mary University of London
The Diplomatic Correspondence of Thomas Bodley, 1585-97
- Harriet Archer, Newcastle University
Quantifying Novelty in Tudor England
- Giles Bergel, University of Oxford
Marking up the Material Text: lessons from letterpress
- Daniel Carey, National University of Ireland, Galway
Early Modern Advice on the Art of Travel: an electronic resource
- Thomas Dabbs, Aoyama Gakuin University, Tokyo
Searching the New Labyrinth: the echoes of Mercutio’s banter in Paul’s Cross churchyard
- Gabriel Egan, De Montfort University, and Brett D Hirsch, University of Western Australia/De Montfort University
“Where do/doe we go/goe from here/heere?” Computational Methods in Compositorial Studies of Early Printed Shakespeare Editions
- Alexandra Franklin, Bodleian Libraries, University of Oxford
Image Mining: reading EEBO for the pictures
- Heather Froehlich, University of Strathclyde
Introducing Genderscope: approaching an analysis of gender in Early Modern London plays
- Brett D Hirsch, University of Western Australia/De Montfort University
Quantifying the Early Modern Dramatic Canon: the Bibliography of Editions of Early English Drama (BEEED)
- Anders Ingram, National University of Ireland, Galway
Corpus Linguistics Software Tools and the ‘Turk’ in Early Modern England
- Rupert Mann, Oxford University Press
Complementary Hybridity: the genesis of Oxford Scholarly Editions Online
- Andrew McRae and John West, University of Exeter
Mapping a Genre: towards a database of Stuart succession literature
- Micheál Ó Siochrú and David Brown, Trinity College Dublin
Mapping the Past: Geographical Information Systems and the exploitation of linked historical data
- Michelle O’Callaghan, University of Reading
Using e-Resources in Teaching: Verse Miscellanies Online and the Commonplacer
- Michael Poston and Rebecca Niles, Folger Shakespeare Library
“Some craven scruple of thinking too precisely”: lessons learned from Folger Digital Texts
- Paul Rayson, Alistair Baron, and Andrew Hardie, Lancaster University
Transforming EEBO-TCP into a Corpus
- Mary Erica Zimmer, Boston University
“Digressive Bibliography”: browsing the bookstalls of St. Paul’s
Robyn Adams, Centre for Editing Lives and Letters, University College London, and Lizzy Williamson, Queen Mary, University of London
This paper will introduce the first phase of the Bodley project hosted by the Centre for Editing Lives and Letters (CELL). This phase of the project explores the corpus of letters produced by Bodley and his network of contacts during his diplomatic appointments. The bulk of the edited correspondence derives from his term as English representative on the Dutch Council of State between the years 1589-97, and is a substantial record of his ambassadorial duties. These letters document his negotiations with the States General on behalf of Elizabeth I, and his voluminous reports of military provision, movements and skirmishes to support the financial and military preparation by the English government who were sending troops, armaments and supplies to aid the Dutch in the war against Spain.
During the transcription process, the project team pioneered a new method of XML text transcription which permitted the idiosyncratic features of early modern handwritten documents to be captured at their fullest. These can then be output on the website according to the reader’s specific demands: they can choose whether to opt for viewing abbreviations, contractions, marginalia, line-breaks and many more. The intention was to provide numerous options in the learning environment and to enable multiple constituencies to use the edition.
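The reader-selectable output described above can be illustrated with a minimal sketch. It assumes the common TEI convention of pairing an abbreviation and its expansion inside a choice element and marking line-breaks with an empty lb element; the project's actual schema is not given here, and the sample sentence and function names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. The element names (<choice>, <abbr>, <expan>, <lb/>)
# follow common TEI usage; the Bodley project's own schema may differ.
SAMPLE = (
    "<p>I haue receaued yo<choice><abbr>r</abbr><expan>ur</expan></choice> "
    "l<choice><abbr>res</abbr><expan>etters</expan></choice><lb/>"
    "of the last weeke</p>"
)


def _walk(elem, expand, keep_line_breaks):
    """Return the text of elem, honouring the reader's display options."""
    if elem.tag == "choice":
        # Show exactly one alternative: the expansion or the abbreviation.
        chosen = elem.find("expan" if expand else "abbr")
        return _walk(chosen, expand, keep_line_breaks) if chosen is not None else ""
    if elem.tag == "lb":
        return "\n" if keep_line_breaks else " "
    parts = [elem.text or ""]
    for child in elem:
        parts.append(_walk(child, expand, keep_line_breaks))
        parts.append(child.tail or "")  # text that follows the child element
    return "".join(parts)


def render(xml_text, expand=True, keep_line_breaks=False):
    """Render one encoded transcription in the view the reader has chosen."""
    return _walk(ET.fromstring(xml_text), expand, keep_line_breaks)


reading_view = render(SAMPLE, expand=True, keep_line_breaks=False)
diplomatic_view = render(SAMPLE, expand=False, keep_line_breaks=True)
```

Because both views are derived from a single encoded source, features such as marginalia could be added as further switches without re-transcribing the letters.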
We are interested in opening the project to wider view, and in discussing the methodologies, editorial apparatus and the various decisions that comprise editorial work in the digital age. The speakers are Dr Robyn Adams (PI), and Dr Lizzy Williamson, (former) PhD student and Research Assistant on the project.
In his breakthrough pastoral work, The Shepheardes Calender (1579), Edmund Spenser was introduced simply as ‘the New Poet’. His claims to innovation, which also coincided with a cluster of similar statements by lesser known writers in 1578-9, would go on to shape modern understanding of Renaissance poetry’s development: C. S. Lewis, for example, dates his Golden Age of English literature from 1580. As a result, the work of late Elizabethan authors such as Spenser, Sidney and Shakespeare continues to be held up as a flourishing of poetic invention. By contrast, the so-called ‘Drab Age’ of 1530-1580 has been neglected as derivative and old-fashioned.
But in a literary culture predicated on imitation and adaptation, in thrall to classical models, why did Renaissance poets see out the 1570s striving to be ‘new’? The digitisation of early modern texts offers new ways to reveal the deep-rooted ambivalence towards innovation in the writing of this decade, a decisive period of imperialism, growing Protestant fervour, and heightened engagement with national identity. In fact, the idea of novelty was radically transformed from something dangerous – a threat to tradition, authority, even national security – to a bold statement of political and patriotic confidence.
This paper will demonstrate the uses to which EEBO-TCP may be put to understand this transformation, and present the results of research carried out using digital resources. Can we trace the emergence of this semantic tension across the sixteenth century as a whole? How accurate are critical assumptions based on an inherited canon? I will show how statistics such as the frequency of word use can yield productive insights into the fluctuating significance of novelty and antiquity in the period, and explore the ways in which these insights might challenge our received perception of early modern innovation and influence.
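One way to picture the word-frequency statistics mentioned above is a relative frequency per decade. This is only a sketch under stated assumptions: the (year, text) input format, the function name, and the two toy texts are all hypothetical, and the paper's own measures are not specified in the abstract.

```python
import re
from collections import defaultdict


def relative_frequency_by_decade(texts, terms):
    """Return {decade: occurrences of any term per 10,000 tokens}.

    texts is an iterable of (year, text) pairs; terms lists the spelling
    variants to count (for example "new" and "newe").
    """
    wanted = {t.lower() for t in terms}
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in texts:
        decade = (year // 10) * 10
        tokens = re.findall(r"[a-z]+", text.lower())
        totals[decade] += len(tokens)
        hits[decade] += sum(1 for tok in tokens if tok in wanted)
    # Normalising by corpus size keeps decades with more surviving print
    # from dominating the comparison.
    return {d: 10000 * hits[d] / totals[d] for d in sorted(totals) if totals[d]}


# Invented toy input, not real EEBO-TCP data.
toy_texts = [
    (1561, "a newe thing and an olde thing"),
    (1579, "the new poet new made"),
]
freq = relative_frequency_by_decade(toy_texts, ["new", "newe"])
```

Counting variants together matters for this period, since a search for the modern spelling alone would miss forms such as "newe".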
This paper will offer some thoughts on the representability of the bibliographical features of early printed books in current cataloguing and markup schemes, focussing on TEI. While TEI is an encoding scheme originally designed for texts as linguistic objects rather than books as material productions, much work has recently been done on adding descriptive markup to TEI for various purposes. Manuscript scholars, for example, have been increasingly well-served by the appearance of tags for describing codicological structures and palaeographical features in recent releases of the TEI. This paper will argue that the needs of bibliographical scholarship are distinct from (but not unrelated to) those of codicologists or linguists, and that bibliographers might engage more closely with the TEI process for mutual benefit. The paper will look at recent work in descriptive bibliography for library cataloguing, as well as materialist bibliography, to ask if lessons learnt in those fields can be transferred to textual markup. Noting that no description can entirely substitute for an original or a facsimile, it will ask how much work can reasonably be done in markup and where conventional bibliographical description or facsimile can work best. Noting how early-modern printers coped with various textual conditions – such as the representation of complex diagrams – the paper will argue that projects of remediation, while invariably imperfect and always at the mercy of the technological means available, can often find solutions that are appropriate to the communication of information. Materiality in textual studies should be seen as not only the incommunicable content of a document, but also the concrete and conscious decisions taken by agents such as authors, editors and printers to find a good-enough solution for an occasion.
This paper describes a digital project to create a database of early modern advice on the art of travel (a genre known as the ars apodemica). With funding from the Irish government and the assistance of a three-year postdoctoral fellow (Dr Gabor Gelleri), I have been developing this resource to establish the scope and interest of an intriguing but neglected body of literature. In the mid-sixteenth century, and continuing for more than two centuries, a genre developed across Europe in which leading authorities pronounced on the merits of travel – some as advocates and others as staunch critics of the practice. Their object was to inform Continental journeys undertaken by young nobles and gentlemen for the purpose of education, language acquisition, political knowledge, and social refinement. More importantly, this form established a moral theory of travel and created conventions with lasting impact on travel writing, cross-cultural exchange, extra-European travel, and the emergence of anthropology.
The database provides bibliographical entries on over six hundred texts in this genre from c. 1570-1800. They include brief essays and lengthy treatises, letters of instruction, academic orations, and prefatory discourses to published accounts of travel. Among the leading contributors to the form are Sir Philip Sidney, Justus Lipsius, Francis Bacon, Fynes Moryson, Joseph Hall, and a range of figures in Germany, Switzerland, the Low Countries, and France. Mapping tools and other visualizations will form part of the site. We are also experimenting with full text as a complement to the database, using the resources of EEBO-TCP for English-language texts (in part based on the Phase II of TCP which NUI Galway has subscribed to, uniquely among Irish institutions).
The website is at a late stage of development and will hopefully launch in summer 2013. The conference presentation will introduce the resource and the material it makes available. Suggestions for further development from the audience will be very welcome.
This talk will examine one highly referential speech by Mercutio in Shakespeare’s Romeo and Juliet in order to bring forward certain heretofore hidden elements within the physical architecture and also the architecture of consciousness at St Paul’s during the 1590s. It will focus on the bookselling district of Paul’s Cross Churchyard on the northeast side of the cathedral. To do this EEBO-TCP and other digital resources in various degrees of development will be assessed along with static, hard copy research.
In Act II, scene IV, of Romeo and Juliet, Mercutio sardonically references six romantic heroines, whom, he asserts, Romeo wishes to place his own love interest above (Laura, Dido, Cleopatra, Helen, Hero, and Thisbe). In each case, an EEBO-TCP search shows that the names of the women Mercutio mentions enjoyed a revival in works printed during the 1590s. The title pages of these works and other records indicate that these texts were available from bookshops in Paul’s Cross Churchyard in the years subsequent to Romeo and Juliet. Instead of just being bawdy banter, Mercutio’s speech points to the print marketplace in the churchyard and shows how a Shakespearean play echoed stories and fashions that were popular, not in Verona, but in the City of London.
Recently, the team at the Virtual Paul’s Cross Project has reconstructed much of St Paul’s cathedral during the early modern period. From this reconstruction and also the search capabilities of EEBO-TCP, we can now identify relationships between this area and the Shakespearean stage with far more precision than before. We still do not know precisely how books were retailed, how long they were held in stock, or how popular they actually were. Also certain significant full texts may not be available yet in a TCP search. These pitfalls noted, this talk will abide by the thesis that digital resources can bring us much closer to an understanding of a lost, physical world and also to human consciousness during the early modern period.
Gabriel Egan, De Montfort University, and Brett D Hirsch, University of Western Australia/De Montfort University
“Where do/doe we go/goe from here/heere?” Computational Methods in Compositorial Studies of Early Printed Shakespeare Editions
In a letter of 3 June 1920 to the Times Literary Supplement, Thomas Satchell noted a striking pattern in the First Folio edition of Shakespeare’s Macbeth: 35 words consistently spelt one way in the first half of the play (e.g. “doe”, “goe”, “heere”) are consistently spelt another way in the second half (e.g. “do”, “go”, “here”). Later scholarship confirmed that such patterns were evidence of the spelling preferences of compositors (as opposed to those of scribes) and identified additional typographical elements of early printed editions (such as scene headings, stage directions, speech prefixes, and lineation) that reflected compositorial habits.
Compositor-habit identification — now considered a fundamental task of analytical bibliography — continues to rely almost entirely on the methods developed by Satchell and his successors almost a century ago. After a brief historical overview, the proposed paper will introduce a prototype computational method for compositor-habit identification and a procedure for testing previous scholarly claims about the various compositorial stints in early editions of Shakespeare.
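The spelling evidence Satchell described lends itself to a simple count per page. The sketch below is not the prototype method the paper proposes, which the abstract does not specify. It tallies only the three variant pairs quoted above and reports the share of long spellings on a page; the function names and the sample line are invented.

```python
import re
from collections import Counter

# The three variant pairs quoted in the abstract: (short form, long form).
VARIANT_PAIRS = [("do", "doe"), ("go", "goe"), ("here", "heere")]


def spelling_profile(page_text):
    """Count the short and long spelling of each variant pair on one page."""
    tokens = Counter(re.findall(r"[a-z]+", page_text.lower()))
    return {pair: (tokens[pair[0]], tokens[pair[1]]) for pair in VARIANT_PAIRS}


def long_form_share(profile):
    """Share of long spellings among all variant tokens, or None if none occur."""
    short = sum(s for s, _ in profile.values())
    long_ = sum(l for _, l in profile.values())
    total = short + long_
    return None if total == 0 else long_ / total


# Invented sample line, not a quotation from the Folio.
profile = spelling_profile("Doe you goe heere, or do you stay")
share = long_form_share(profile)
```

Pages whose share sits near 1.0 or near 0.0 would point towards different spelling habits, while pages with few variant tokens give too little evidence either way, which is why the function returns None rather than a misleading zero.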
Application of image matching software on a sample of early modern illustrated broadside ballads in two projects during 2011-12 [‘Integrating Broadside Ballads Online Resources’ and ‘Engaging with Early Modern Print Online’ (co-directors: Richard Ovenden and Giles Bergel)] provided an opportunity to compare the woodcut images used on the ballad broadsides with impressions in other publications as presented in EEBO. Examples of illustrations delivered by EEBO that matched images in the ballads set will be shown in this paper.
For the Bodleian ballads collections, image matching represented an additional dimension to the treatment of the woodcut illustrations, adding to the existing iconographic index of the woodcuts, made in 1997-9, that relied on human application of the classification system, ICONCLASS.
Whether matching impressions by computer or by eye, the positive identification of woodblocks that were re-used in different publications has bibliographic significance, with the potential to identify the printing house in which a broadside originated. Chronologies can be established for undated works by the cracks and worm-holes that appeared in woodblocks over time.
A different quality of inquiry, establishing all of the contexts of use of a particular visual composition, is of increasing interest to scholars examining the use and re-use of images in early modern print, as reading aids or as meaningful iconographic statements – or both. In this sense image matching offers a support and extension of the work of Ruth Luborsky and Elizabeth Ingram, as well as other scholars working on individual topics, and McKerrow and Ferguson’s works on printers’ devices and title page borders.
As a methodology, image matching therefore corresponds on the one hand to compositor studies of type damage and on the other hand to text-mining of the TCP corpus. What measures will make it viable as a method and acceptable as an interpretive tool in either of these roles? To what extent does it support, enhance, or make obsolete the indexing of illustrations by conventional classification systems?
Through a number of literary and linguistic approaches we are largely familiar with Shakespeare’s writing, including how he portrays male and female characters on-stage. We know about his strong female characters and his tragic heroes. But what about the rest of Early Modern London drama? Through the EEBO-TCP initiative, we have access to all remaining Early Modern London drama, allowing us to address questions using larger quantities of text than ever before.
In this paper I will discuss the iterative process of selecting and encoding literary-linguistic features for a digital text analysis called Genderscope. These literary-linguistic features have been derived from the Shakespeare corpus, focusing on the intersection of gender and formality in drama. I apply this metric to a sample from EEBO-TCP, using a collection of 400 plays from Early Modern London.
In doing so, I suggest ways that Genderscope can be used to help identify specific authors and plays which have been largely ignored by more canonical studies of gender in Early Modern drama. In creating a tool to distance-read gender in Early Modern textual objects, I address some of the difficulties and problems inherent in this process. I suggest ways to consider these as avenues for further research on gender in the EEBO-TCP corpus, while highlighting some plays and authors who might require more attention through more “traditional” close reading.
How do early modern plays become canonical? What roles do editors and publishers play in shaping the reception and meaning of the early modern texts they reproduce?
The Bibliography of Editions of Early English Drama (BEEED) is a comprehensive enumerative bibliography of every edition of early modern English drama produced since the publication of Robert Dodsley’s twelve-volume anthology in 1744. Database entries are extremely detailed, informed by archival research into publishing and editorial histories, allowing for complex queries and quantitative analysis. An open-access, web-based platform (funded by the Australian Research Council) is currently in development.
With BEEED, it will be possible to historicize — for the first time — the processes of canon formation, identifying precisely which plays were available at any given point in time since the eighteenth century. In addition, editors of a play may use the database to compile a bibliography of historical editions to consult for collation; scholars may map and explore editorial and publishing trends; and, educators may easily assess whether suitable editions are available to assign to syllabi. This paper offers an overview of the project’s aims, methodologies, and proposed outcomes.
At the 2012 EEBO-TCP conference, a series of papers were presented by the CREME (Corpus Research in Early Modern English) project based at Lancaster University, showcasing new digital tools, methods and methodologies. This paper is a direct response to that conference and will apply the CQPweb interface developed by CREME for Corpus Linguistics to discourse analysis, specifically on the ‘Turk’ as a recurring figure in early modern English writing.
Over the past 15 years, under the influence of Edward Said and Nabil Matar, a detailed scholarship has grown up on the ‘Turk’ in various generic contexts. This field has tended to focus upon the symbolic aspects of the ‘Turk’ through close reading of specific sources, particularly drama, reflecting its disciplinary roots in literary criticism. My study will develop a new approach to discourse of the ‘Turk’ by utilising Lancaster University’s CQPweb interface to process and analyse a Corpus of 12,284 early modern English printed texts (624,277,146 words) derived from TCP.
I will search this Corpus for the word ‘Turk’, its variant spellings (e.g. Turke) and derivations (e.g. Turkish, Turcism). I will then sort these occurrences for frequency distribution chronologically, collocation (words appearing in close proximity to the search term), and distribution of those collocations. I will not only examine the contexts in which English usage of the ‘Turk’ peaked, but through collocations, identify key commonplace associations and phrases (e.g. ‘Turks and Pagans’, ‘Turkish Empire’). Having generated these results I will use the CQPweb interface to work back to the results and explain them by relating them to the contexts in which they occur. This method will allow for an empirically based discussion of the ‘Turk’ in early modern writing, drawing upon both quantitative and qualitative analysis, which can then be used to consider the validity of prevailing interpretations. In particular I will question the chronology within which English interest in the ‘Turk’ waxed and waned, and explore current scholarly emphasis on its symbolic aspects as a signifier of difference.
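Collocation, as defined above, can be sketched as a count of the words that fall within a fixed window around any form of the search term. This is a plain-Python illustration using raw counts, not CQPweb itself, which offers statistical collocation measures; the function name, window size, and sample sentence are assumptions made for the example.

```python
import re
from collections import Counter


def collocates(text, node_forms, window=4):
    """Count words occurring within `window` tokens of any node form."""
    tokens = re.findall(r"[a-z]+", text.lower())
    nodes = {form.lower() for form in node_forms}
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok not in nodes:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        # Count every neighbour in the window except the node token itself.
        counts.update(t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i)
    return counts


# Invented sample sentence, not drawn from the corpus.
sample = "the Turke and the pagan; Turks and Pagans alike"
counts = collocates(sample, ["Turk", "Turke", "Turks"], window=2)
```

In practice function words such as "the" and "and" dominate raw counts, which is one reason corpus tools rank collocates by a statistical association score rather than by frequency alone.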
Oxford Scholarly Editions Online is an online product published by Oxford University Press that contains digital versions of scholarly editions of works written by authors chiefly active in the sixteenth and seventeenth centuries. It launched in September 2012, and will grow to include many hundreds of editions of works written in all periods, across many disciplines. Currently, all the editions it contains have already been published in print.
This paper describes some of the thinking behind the project, beginning with the use cases we wanted to support and the decision to capture all the content in XML. It discusses the difficulties of maintaining fidelity to the original editions (enabling the digital version to be consulted and cited interchangeably with the print) at the same time as leveraging opportunities offered by digital publication (enabling, for example, text searching that is restricted to specific elements within an edition). Metadata is crucial in this balancing act, allowing organization of the content along useful axes without disrupting the editions themselves, and instantiating decisions made by our editorial board.
I finally consider what sort of thing we have made. It is clearly in some sense a hybrid, made at a transitional time in scholarly editing and publishing. How do we intend to accommodate born-digital editions in it? And how might users use it alongside massive corpora of texts such as that being created by the TCP?
The AHRC-funded ‘Stuart Successions Project’ is examining writing produced at moments of royal (and protectoral) succession in Britain during the long seventeenth century. At the centre of the project’s outputs will be an online bibliographical database of ‘succession literature’, and this paper will outline some of the principles behind the database’s ongoing development.
The database is being compiled by searching the EEBO records that are available for each succession year between 1603 and 1702 for works that are responding to a succession, with the details of relevant texts being catalogued. As well as recording bibliographical details (full title, year, author, publisher, printer, place of publication, and format), the database will include categorisations of each text (for example, according to monarch, literary form, language, or other royal and political figures). All this information will be cross-searchable and the database will ultimately present records for ‘succession literature’ printed in each year of a royal succession plus one (1603 and 1604, 1625 and 1626 etc.). Alongside this, there will be an appendix of material relating to the Cromwellian successions of the 1650s.
Our paper will aim to outline our project, placing it in the context of relevant developments in the digital humanities. We will present statistical evidence and short case studies from the database in order to illustrate how it has helped us to highlight patterns in the forms and preoccupations of seventeenth-century succession writing that may challenge existing presuppositions about literature and politics in the period. We hope that the paper, as well as outlining a rationale for the creation of a new digital resource, will therefore prompt discussion around the question of how such resources can rejuvenate and set new agendas in particular fields of academic research.
Micheál Ó Siochrú and David Brown, Trinity College Dublin
Mapping the Past: Geographical Information Systems and the exploitation of linked historical data
Early modern sources comprising maps and tabular data require a highly structured digitisation approach to enable their use as an effective research tool. The Down Survey is a set of large-scale plantation maps from 1655-8 covering most of Ireland, which formed an essential part of the Cromwellian land settlement. It is part of a range of tabular sources for Ireland created throughout the 1650s, recording land ownership, valuations and population. These sources are closely related and were designed to be complementary to one another.
The Down Survey Project at Trinity College Dublin has rendered these sources into digital form by means of a bespoke Geographical Information System. The Books of Survey and Distribution and 1659 Census comprise two databases linked by placename. The 17th century placenames are linked in turn to their modern equivalents, datasets provided by the Ordnance Surveys of Ireland and Northern Ireland that provide spatial coordinates for each place. The Down Survey standardised the townland as the basic unit of land in Ireland and this unit is still in use today.
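The linkage described above, two tabular sources joined by placename and then tied to modern names with coordinates, can be pictured as a chain of lookups. The sketch below is illustrative only: the field names, the function name, and every value in the sample data are invented, and the project's actual data model is not described in the abstract.

```python
def link_by_placename(survey_rows, census_rows, modern_equivalents, gazetteer):
    """Join survey and census rows on historic placename, then attach the
    modern name and coordinates where a match is known."""
    census_by_place = {row["place"]: row for row in census_rows}
    linked = []
    for row in survey_rows:
        place = row["place"]
        modern = modern_equivalents.get(place)
        census = census_by_place.get(place, {})
        linked.append({
            "historic_place": place,
            "owner": row.get("owner"),
            "population": census.get("population"),
            "modern_place": modern,
            # Coordinates come from the modern gazetteer, never from the
            # seventeenth-century source itself.
            "coordinates": gazetteer.get(modern) if modern else None,
        })
    return linked


# Entirely invented sample data, for illustration only.
survey = [{"place": "Ballyexample", "owner": "A. Landowner"},
          {"place": "Unmatchedtown", "owner": "B. Landowner"}]
census = [{"place": "Ballyexample", "population": 42}]
modern_names = {"Ballyexample": "Ballyexample (modern)"}
gazetteer = {"Ballyexample (modern)": (53.0, -7.0)}

records = link_by_placename(survey, census, modern_names, gazetteer)
```

Keeping unmatched places in the output with empty fields, rather than dropping them, makes gaps in the linkage visible to the researcher.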
A GIS, built on the open source platforms of Google Maps and Open Street Map, is used to display and interrogate this data online. The GIS tool includes geo-referenced image overlays of Down Survey and 19th-century Ordnance Survey mapping to enable the user to reference Early Modern mapping in a modern context with the data visible on any of the map layers. It will be available on a free public access website, www.downsurvey.tcd.ie, from April, with digital images of all surviving Down Survey parish, barony and county maps.
This structured approach facilitates the incorporation of any text incorporating a placename, allowing spatial analysis to be employed as a research tool on a variety of historical sources, spanning a wide chronological and thematic range.
Verse Miscellanies Online hosts an editing tool, the ‘Commonplacer’, which is intended to facilitate engagement with the processes of selection, modification, and compilation involved in editing texts. Its use within the classroom raises particular questions about how we can encourage students to reflect critically on the editorial decisions they make. Neil Fraistat and Steven Jones have described the exemplary digital edition as an ‘”editorial environment” with spatial, temporal, procedural, performative, and participatory properties’. In this environment, the user has an active and collaborative role in producing the digital resource. Such a model has the potential to transform students’ experience of the text and the learning environment within the classroom. However, as students become editors actively engaged in the production of digital resources what kind of expertise and knowledge base do they need to acquire to carry out this role? How useful are print-based editorial methodologies for understanding the production and use of digital resources?
The Folger Digital Texts (FDT) project seeks to create digital versions of the New Folger Shakespeare Library Editions that will act as the online Shakespeare text of record. This means that FDT texts must be stable, reliable, and indexable by users. We aim to conform to the conventions of digital humanities research in order to provide a base upon which researchers can build. To that end, the dramatic texts are encoded in TEI P5-compliant XML. Every word, textual space, and punctuation mark is assigned a unique identifier.
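The idea of giving every word, space, and punctuation mark its own identifier can be sketched as follows. The labels w, c, and pc echo the TEI element names for words, characters, and punctuation, but whether FDT uses them in this way, and the identifier format shown here, are assumptions made for illustration.

```python
import re

# A word (allowing internal apostrophes), a run of whitespace, or a single
# punctuation character.
TOKEN = re.compile(r"\w+(?:[’']\w+)*|\s+|[^\w\s]")


def assign_ids(line, prefix="ex"):
    """Split a line into words, spaces and punctuation, each with its own id.

    The id format (prefix plus a zero-padded counter) is hypothetical.
    """
    items = []
    for n, match in enumerate(TOKEN.finditer(line), start=1):
        text = match.group()
        if text.isspace():
            kind = "c"    # a textual space
        elif re.match(r"\w", text):
            kind = "w"    # a word
        else:
            kind = "pc"   # a punctuation mark
        items.append({"id": f"{prefix}-{n:04d}", "kind": kind, "text": text})
    return items


items = assign_ids("To be, or not")
rebuilt = "".join(item["text"] for item in items)
```

Because spaces and punctuation are tokens in their own right, the original line can be rebuilt exactly from the identified items, which is what allows other researchers to point at any position in the text without ambiguity.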
However, the inherent messiness of the primary text does not always neatly conform to the TEI ontology. In fact, inconsistent early modern textual structures often resist mapping onto rigid TEI hierarchies. In our case, some difficulty is mitigated because we are dealing with modern editions that impose a level of uniformity. Even so, we are forced to use a highly complex data structure that uses TEI in unconventional or unprecedented ways.
This presentation will explore the negotiations between the TEI guidelines and our application of them. We will discuss how we have dealt with overlapping hierarchies, unclear/inconsistent hierarchies, missing/incomplete information, and ambiguity in Shakespeare’s dramatic texts. We will explain how creating visualizations early in the process helped us validate our encoding decisions. Finally, we will extrapolate how our techniques can be applied to primary texts that do not exist in a modern-edited form. We anticipate that feedback from the attendees of the 2013 EEBO-TCP conference will be invaluable as we continue to navigate the challenges of the FDT project.
The transcribed versions of the Early English Books material provide an unparalleled collection enabling research in a number of disciplines. At least three communities of users (historical corpus linguists, historians and literary scholars) have begun to explore its potential when viewed as a corpus. At the Oxford EEBO-TCP conference in 2012, the CREME research group at Lancaster University described the groundwork required to enable corpus-based analyses of EEBO-TCP, and how it could be used to investigate how discourses of liberty and revenge changed from 1473 to 1700.
In this paper, we describe the next stages of our work to transform the full EEBO-TCP collection into a corpus. Starting with the 44,219 texts from EEBO-TCP phases 1 and 2, we have added two levels of linguistic annotation using automatic natural language processing (NLP) tools. First, grammatical annotation (part-of-speech tagging using CLAWS) allows the researcher to study lexical and grammatical features. Second, semantic field annotation (using the USAS tagger) facilitates searching for concepts rather than simple words or phrases. Using automatic NLP tools on historical texts typically falls foul of the large amount of spelling variation they contain, and last year we introduced pilot research in this area. Here, we will describe the two stages of research using the Variant Detector tool (VARD) to manually link modern equivalents to historical variants in large text samples per fifty-year period, thereby enabling a more accurate automatic linking process on the remainder of the EEBO-TCP texts. Finally, all texts are indexed using the CQPweb corpus software to enable fast search, retrieval and analysis of the words and their linguistic annotations.
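The two-stage process described above, manual linking on a sample followed by automatic linking on the remainder, can be pictured with a simple lookup table. This sketch is not VARD's algorithm, whose workings are not described in the abstract; it merely learns the most frequently chosen modern form for each variant from hand-checked pairs and applies it to unseen tokens. The function names and sample pairs are invented.

```python
from collections import Counter, defaultdict


def learn_mappings(manual_pairs):
    """Stage one: from hand-checked (variant, modern form) pairs, keep the
    modern form chosen most often for each historical variant."""
    votes = defaultdict(Counter)
    for variant, modern in manual_pairs:
        votes[variant.lower()][modern.lower()] += 1
    return {variant: forms.most_common(1)[0][0] for variant, forms in votes.items()}


def normalise(tokens, mappings):
    """Stage two: replace known variants automatically and leave every
    other token unchanged."""
    return [mappings.get(tok.lower(), tok) for tok in tokens]


# Invented sample decisions, standing in for a manually normalised sample.
manual_pairs = [("doe", "do"), ("doe", "do"), ("doe", "doe"), ("heere", "here")]
mappings = learn_mappings(manual_pairs)
normalised = normalise(["I", "doe", "goe", "heere"], mappings)
```

Training a separate table for each fifty-year period, as the sampling described above suggests, would let the mapping reflect the way spelling conventions shift over time.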
An important consideration in our research is representativeness since this will impact on all types of quantitative study carried out on the EEBO-TCP corpus. We need to consider the distribution of the texts in EEBO-TCP to address issues of comparability across time, genre and text type. In order to be confident about quantitative results derived from the EEBO-TCP corpus, we need to have a clearer picture of what proportions of EEBO material per decade have been transcribed and whether historians would consider this a good representative sample of books available at the time. This should also be kept under review as further EEBO-TCP data becomes available.
The envisioned project will focus upon mapped bookstalls in a delimited area of Paul’s Cross Churchyard in London, before the destruction of St. Paul’s and its environs in the Great Fire of 1666. Building upon Peter Blayney’s The Bookshops in Paul’s Cross Churchyard (1990), the project will render digitally maps already created, then use existing EEBO-TCP scans to place the works themselves within the “shops or sheds” from which they were sold. In doing so, this model will organize EEBO-TCP texts spatially and temporally while encouraging insights developing from Blayney’s seminal research. To the extent that searchable full texts of these works are available, the project’s database design will also bring forward further connections among books, their makers, and their environments, building upon existing EEBO-TCP search functions to enrich understanding of this textual corpus. Creating this virtual marketplace of ideas, works, and contextual information will allow audiences to browse the bookstalls from a first-person perspective, while harnessing the capabilities of what Martin Mueller has termed “scalable reading” to reveal patterns of wider interest.
Through a visual interface for which Blayney’s maps will provide underlying structure, the project will build upon and nuance current pathways to and through EEBO-TCP texts. Data fields envisioned beyond existing EEBO-TCP options would include searches by “physical format” and “printer/publisher,” with full text search capabilities also nuanced, as feasible, to show the frequency of particular keywords, as well as clusters of terms within and among works.
As noted, capabilities in this area would depend upon the availability of searchable full-text files. In addition to using files currently transcribed by hand through the Early English Books Online-Text Creation Partnership (EEBO-TCP), this dimension might be supported through efforts to improve Optical Character Recognition (OCR) for early modern texts, such as those taking place under the direction of Laura Mandell at Texas A&M University. Search capacities might also be developed through collaboration with the University of Lancaster’s Corpus Research on Early Modern English (CREME) group. While challenges are acknowledged, sustained conversation regarding possibilities would be welcome.
Integrating such a project with the existing architectures of EEBO-TCP and other resources will require sustained, thoughtful conversation, with the goal of creating further avenues of engagement both valuable and attractive to those concerned. Two further resources with which this project might link, for example, would be the University of Oxford’s work to create a digital edition of the Stationers’ Register as edited by Edward Arber and the Oxford Scholarly Editions Online database, launched in September 2012.
Ultimately, the project will serve as both a reference valuable in its own right and a portal of access able to be extended in years to come.