A TEXT CREATION PARTNERSHIP
Companion

About ECCO-TCP

ECCO-TCP resulted from a partnership with Gale, part of Cengage Learning, created to produce about 3,000 highly accurate, fully searchable, SGML/XML-encoded texts from among the 150,000 titles available in Gale's Eighteenth Century Collections Online (ECCO) database.

About ECCO-TCP and Eighteenth Century Collections Online

Gale's ECCO (Eighteenth Century Collections Online) includes significant English-language and foreign-language titles printed in the United Kingdom during the 18th century, along with thousands of important works from the Americas. The database contains more than 32 million pages of text and more than 205,000 individual volumes in all. In addition, ECCO natively supports OCR-based full-text searching of this corpus. This is significant: unlike EEBO-TCP, which produced searchable text where there was previously none at all, ECCO-TCP could only hope to produce text that was more accurate and more reusable than what was already available. The larger size of ECCO (a result of the great increase in printing in the 18th century and the much better chances of survival of works printed then) also made it a different proposition: nothing so ambitious as EEBO-TCP's coverage was feasible for ECCO-TCP.

Creating full-text transcriptions

Because of these greater challenges facing ECCO-TCP, it is perhaps better described as a proof of concept than as a completed project. With the support of more than 35 libraries, the TCP keyed, encoded, edited, and released 2,473 ECCO-TCP texts. A further tranche of 628 texts was keyed and encoded but never fully proofed or edited. The texts in this group remain useful for many purposes, however, and bring the total of ECCO-TCP texts to over 3,000. In cooperation with Gale Cengage, these texts have been made freely available to the public. To users working with the EEBO-TCP texts, the ECCO-TCP texts may form a useful adjunct, since for the latter some attempt was made to select works by authors who straddled the divide between the 17th and 18th centuries, the thought being that authors whose earlier works we had included in our 17th-century corpus could be "completed" by having their later works included in our 18th-century (ECCO-TCP) corpus. That helps account (for example) for the heavy representation of Defoe in ECCO-TCP. ARTFL (Chicago), for one, has taken advantage of this combination by offering a combined EEBO + ECCO search.

Uses and availability

Because there are no longer any restrictions on how the ECCO-TCP texts may be used and shared, users have made this data and metadata available in various forms and formats around the Web. Aside from Michigan's TCP site, for example, the ARTFL site offers access to the ECCO-TCP corpus via its PhiloLogic search engine (thanks to Robert Morrissey). The Oxford Text Archive (thanks chiefly to Sebastian Rahtz) offers the ability to search the ECCO-TCP texts as well as to download them in TEI P5 XML, EPUB, or HTML. The "Corpus of Late Modern English Medical texts", created at Helsinki by I. A. J. Taavitsainen and T. Hiltunen, was based on ECCO-TCP medical texts requested specifically by those researchers (by "Late Modern English" they mean the later span of Early Modern English, viz., the 18th century). The ECCO-TCP files have been particularly favored by those seeking to improve the accuracy of OCR on historic typefaces. They were, for example, heavily used and hosted by 18thConnect during its OCR experiments, and by the "eMOP" project, which used our files as its "ground truth."
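
For readers unfamiliar with the term, "ground truth" here means a trusted transcription against which OCR output can be scored. The sketch below is purely illustrative and is not drawn from 18thConnect's or eMOP's actual tooling: it assumes a simple character error rate (edit distance divided by the length of the trusted text), and the function names and sample strings are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ocr_text: str, ground_truth: str) -> float:
    """Edits needed to turn OCR output into the trusted text, per character."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    return edit_distance(ocr_text, ground_truth) / len(ground_truth)


# Hypothetical example: OCR misreads a long s as "f".
print(character_error_rate("fame", "same"))  # 0.25
```

A lower rate means the OCR output is closer to the keyed transcription; real evaluations typically also normalize whitespace and ligatures first, which this sketch omits.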