A TEXT CREATION PARTNERSHIP Companion
The Text Creation Partnership was a consortium of (mostly) university and college libraries, led by Michigan and Oxford, that joined together to create standardized, accurate, and faithful XML/SGML-encoded electronic text editions of early printed books. We transcribed and marked up text -- through manual keying, rather than optical character recognition (OCR) -- from about six million pages, as accessed through the scanned page images in ProQuest’s Early English Books Online, Gale Cengage’s Eighteenth Century Collections Online, and Readex’s Evans Early American Imprints.
The project created more than 70,000 transcribed and encoded historical texts. Its scope and scale remain unprecedented among digitization and text-encoding projects of its kind. Our policies were imbued with a librarian's attitude toward content: a resolve to prepare materials without agenda or bias, and with a view toward wide use and reuse.
Through our partnership with private vendors, we had access to a huge trove of images from which to transcribe. In return, these companies were supplied with a full-text index to their images -- work that would otherwise have been difficult or expensive to produce.
Our work was jointly funded and is owned by more than 150 libraries worldwide. These libraries owned the transcriptions from the moment they were created and were committed to making them publicly available.
Full text can be searched using web interfaces provided by the University of Michigan Library and elsewhere; subsets and modified (often improved) versions of the files have also been hosted by other universities. Readers may want to look, for example, at the Early Print Project, the ARTFL site, and the English Corpora site. In the UK, the texts were previously also available through JISC's Historical Texts portal (now discontinued).
Any institution or individual is free to host the texts in any system or interface they choose, or to use or re-use them in any way they choose. Use or distribution of the underlying images is subject to licensing terms set by their commercial providers, and is restricted to those who subscribe to the commercial databases containing them.
Arrangements between the Text Creation Partnership, its partner libraries, and its corporate partners have differed slightly between projects, but the overall structure in every case has given partner institutions, their students, and faculty the right to store, host, distribute, share, manipulate, alter, analyze, and otherwise work with the content from the moment of its creation. The same arrangements always provided for a "window of exclusivity" during which partner institutions were obliged to restrict any further distribution to other partner institutions -- followed by a removal of all restrictions, perpetual ownership, and perpetual public access.
All four of our major projects (Evans, ECCO, and EEBO Phases 1 and 2) have concluded their period of exclusivity, and all texts created by those projects are now free from all licensing or copyright restrictions. EEBO Phase 2 was the last to be liberated in this manner, as of 1 August 2020. Libraries that signed a Local Management Agreement and had hitherto been subject to its terms for licensing and access at that point became free to disregard the agreement as moot.
Our project began in 1999 as an experimental partnership among the university libraries of Michigan and Oxford, ProQuest, and the Council on Library and Information Resources (CLIR). The goal of the project was to produce standardized, digitally-encoded electronic text editions of 25,000 titles from ProQuest’s Early English Books Online.
A working group developed an SGML DTD derived from TEI P3, the text encoding standard at that time, influenced by variants of TEI-Lite current among many library-based digitization shops. Staff at U-M Library developed a set of capture and encoding instructions using this DTD. Data-conversion vendors were asked to submit bids for keying and markup using these instructions. And thus work began: texts were selected each month at Michigan, page images were supplied by ProQuest, marked-up transcriptions were submitted by the vendors, and quality control and editing were undertaken at U-M Library and soon also at the Bodleian Libraries in Oxford and at subsidiary sites at the National Library of Wales, Aberystwyth, and the University of Toronto.
During its course, TCP employed nearly a hundred editors and immediate production-related staff in Ann Arbor, Oxford, Aberystwyth, and Toronto, as well as dozens more in other roles, supporters willing to serve on the executive board and ad hoc working groups, and hundreds of keyers, editors, quality-control specialists, encoders, and managers at four data-conversion firms (Apex, SPi, Aptara, and AELData).
EEBO-TCP met its goal of producing 25,000 books in 2009 (those books thereafter known as "EEBO-TCP Phase 1"), and then undertook work on a second phase to convert the first edition of each remaining unique monographic work in EEBO—another 40,000 or so books, for a total of around 70,000, if all hopes were realized.
In 2005, the TCP executive board and staff sought to expand the TCP model to other databases of historical books, namely, Gale Cengage’s Eighteenth Century Collections Online (ECCO) and Newsbank Readex’s Evans Early American Imprints (Evans-TCP). These projects never received quite the support attracted by EEBO-TCP, and in the end produced only about 8,000 texts, compared to the 60,000 produced by EEBO-TCP, with another few thousand on the way.
Though the TCP was almost entirely self-funded -- funded by its members -- it also received an NEH grant. Under this grant, a subset of texts related to travel and navigation was identified and converted to the same standards as other works. The resultant collection, EEBO-TCP Collections: Navigations, was made possible by a Humanities Collections and Reference Resources grant from the National Endowment for the Humanities (NEH) Division of Preservation and Access. The project ran from May 1, 2013, to October 31, 2015. Though as a matter of timing this tranche of texts belongs to EEBO Phase 2, as a federally funded project the Navigations texts were from the outset free from the Phase 2 restrictions and open for public use.
Questions are welcomed at tcp-info@umich.edu.