A TEXT CREATION PARTNERSHIP
Companion

TCP Production Files


It is one thing to speak blithely about the creation of SGML/XML-encoded text, such as was created from early printed books under the auspices of the Text Creation Partnership at the University of Michigan Library; it is quite another to anticipate all the quirks of format, structure, and symbology that early printers and authors could come up with during a singularly inventive age. This directory contains files used during the course of production to document decisions about the appropriate capture and encoding of features encountered in early print. The original keying and coding guidelines were written in a week; as books appeared, those guidelines, and the minimalist schema that went with them, needed to be revised and applied. These documents are thus very much ad hoc, created in response to actual questions and actual books. As such, they are working files, not originally intended for public distribution, but preserved here for the insight they might provide into the texts that the TCP produced, and as a record of decisions made, both wise and foolish.

VENDOR DOCUMENTATION

Keying/encoding instructions, version 3 (partial revision 2004)
Detailed guidelines for capturing the textual information in the EEBO items. (Version 1 and Version 2 are still available).
Sample pages
Index to 25+ sample pages from potential EEBO items, each presented as a page image or pair of page images (in .pdf) and a corresponding transcription (in SGML).
Calculating EEBO error rates
Documentation of sampling procedures and error-rate calculations
Examples of errors
Examples of "excusable" and "inexcusable" character-level transcription errors
"Illegible" ($) overused (1)
Examples of text unnecessarily marked as illegible
"Illegible" ($) overused (2)
More examples of text unnecessarily marked as illegible
"Illegible" ($) overused (3) and (4)
Yet more examples of text unnecessarily marked as illegible
The other extreme: guessing
Examples of text captured without sufficient warrant in the damaged original
More of the same
More examples of creative capture
Roman numerals
Two special problems with roman numerals: overlining and backwards-c
TEI guidelines
TEI P3 documentation, including element-by-element descriptions
EEBO tagging "cheat sheet"
Supplies a summary description of each of the elements of the EEBO tag set (prepared for purposes of internal training)
DIV TYPEs
List of common and preferred values for the TYPE attribute
Decorated initials
Page of sample decorated initials (and non-decorated large initials for comparison)
Apothecaries' symbols
Capture of apothecaries' symbols (ounce, dram, scruple, etc.) as found in medical recipes.
Alchemical symbols
Some samples of alchemical symbols, with suggestions for capture (draft)
Unusual symbols used as note markers
Suggested capture for notes that use unusual symbols as markers.
Noting subtle font changes
Examples of subtle typeface changes to be marked with <HI> or <Q> (etc.).
Inverted letters
Examples of letters accidentally printed upside-down.
Sample alphabets from Caxton's press: his 'type 1' font; his 'type 2' font; his 'type 3' font; his 'type 4' font; his 'type 5' font; his 'type 6' font
Alphabets and letter combinations extracted from Caxton's type fonts, with tentative instructions on capture (to be revised as various letter combinations are seen in context in the books themselves).
Additional symbols
A supplement to the main keying instructions.
Character capture issues (March 2005)
Five proposed areas of change and innovation in character capture.
When there's nothing there...
Quick summary of the treatment of blanks and things missing.

INTERNAL (REVIEWERS') DOCUMENTATION

[in progress] All characters list
Experimental list of all available character entities, with pictures. [will never be as up to date as the auto-generated charent list on which it is based.]
All character entities
List of all available charents (both TCP-created and ISO sets) with displayable forms as used in derivative XML version of texts [auto-generated from character map file (see below)].
Additional symbols/charents
Growing list of symbols for reviewers to recognize and supply beyond those in vendor instructions
More odd uses of symbols and characters
Especially math
Overview of review process
Basic guide to the in-house review process as a whole
How to proof
Step-by-step guide to the proofing stage (preparing and proofing sample)
How to review
Step-by-step guide to the tag-review stage (reviewing and correcting book)
How to end
Step-by-step guide to the final stage (checking in and reporting)
More Latin abbrevs
Further examples of Latin abbreviations, etc.
Ambiguous abbreviations
Draft policy on ambiguous characters and symbols, with examples. (Also includes more examples of apothecaries' measures.)
Anglo-Saxon type
[now moved to vendor area]
Greek type and ligatures
Early modern Greek type and its characteristic forms and ligatures: introduction and a few unorganized samples
Hijacked symbols
Some thoughts on symbols pressed into duty against their will.

Reviewers' questions and tips relating to ...

Structure
Using DIVs to group like things; Using GROUP instead of BODY for several texts with common title front and/or back matter; DIVS and LETTER tags; Songs embedded in plays; Using Q for "raisins in oatmeal"; OPENERs and CLOSERs as holdalls; Dialogues and Catechisms: Questioner and Responder; When pages are in the wrong order.
Notes and Milestones
Note markers; Note placement; Handling endnotes; STAGE and NOTE combined; Use of MILESTONE unit attribute; MILESTONEs with illegible values; Multiple notes with a single reference
Captions, Headings, and Quotations
Captions in figures; ARGUMENTS in verse; Quotations on title pages; Authorial interjections in quotations; Changing &startq; into <Q> and <HI>; <Q>s broken by <P>s; Q+BIBL inside HEAD; Q+BIBL inside TRAILER; Using running header for division header; Placement of epigraphs.
Letters
New tag: POSTSCRIPT; DIV versus LETTER; SALUTE and SIGNED; Use of DATELINE and DATE (DATELINE and SIGNED, DATELINE without DATE, Including dating system within DATE); Sample CLOSERs with problems; Correct sample CLOSERs and SIGNEDs; Lists of signatories.
Matters philosophical
Correcting illegibilities; Counting in/excusable errors; Purpose of DIV types; Printer's errors.
Matters miscellaneous
Superscripts, including superscript o; Clarifying UNCLEAR; Long or short lines in verse; Abbreviations and abbreviation entities; Tagging "Explicit"s; Editing TABLEs; Acrostic poem; "Spoken by..." in plays; Letters for rubricator.
Title Page matters
Proofing the title page; Handling epigraphs on title pages; Imprimaturs, approbations, licenses
Software tips (esp. TextPad)
TextPad clip libraries; TextPad upgrades; TextPad syntax file (for color-coding tags); downloading EEBO pdfs.
Divisions (DIVs)
Assigning div types; Sample div types; Use of "N" attribute alongside "TYPE"
Lists
Lists with curly braces; Genealogies as lists; Tables of Contents and Indexes as lists; Changes to the model of LIST; Syllogisms as lists
Character capture issues
Z and yogh in Scottish texts; Other uses of z; I/J; Illegibilities

Code

For internal use only

Vendors' coding and capture queries (all very old)

  1. (No. A1) Re: Drama tags (<SP>, <SPEAKER>) in non-dramatic dialogs. Marginal notes and numbers in prose texts. Page-level illegibility (see now P12 instead).
  2. (No. A2) Re: Milestones.
  3. (No. A3) Re: Musical notation.
  4. (No. A6) Re: Single table, illustr., etc. spanning multiple pages.
  5. (No. P1) Re: Marginal notes and numbers in prose texts. Strange "q"-like character in Latin passage.
  6. (No. P2) Re: Odd characters: stars, pointing fingers, and dot-triplets.
  7. (No. P3) Re: Braces; <STAGE> directions; marginal notes IMPLICITLY linked to asterisks in the text.
  8. (No. P4) Re: Interlinear numbers in a "puzzle" poem; <SPEAKER> tags; <SPEAKER>s identified only by number.
  9. (No. P5) Re: ee and oo ligatures with acute accent marks
  10. (No. P7) Re: Numbers appearing usually (but not always) at beginnings of <P>s; specialized vs. default (fallback) tagging; blocks of text after FINIS (<BACK> matter).
  11. (No. P8) Re: identifying <LETTER>s buried in running text.
  12. (No. P9) Re: missing t.p.; verse paragraphs; poetic letters; analytical summary table of contents; list vs. table; lapidary inscriptions; fractions; mismatched catchwords (missing pages?)
  13. (No. P10) Re: text attached to figures; acrostics printed at an angle.
  14. (No. P11) Re: in-line figures; overlining (of roman numerals).
  15. (No. P12) Re: damaged and illegible text; out-of-sequence pages
  16. (No. P13) Re: song lyrics interspersed with musical notation
  17. (No. P15) Re: duplicate pages: capture both or one & if the latter, which one?
  18. (No. P16) Re: right-justified words at ends of verse lines
  19. (No. P17) Re: multiple typefaces used concurrently, partly to mark quotations
  20. (No. T1) Re: miscellaneous tagging problems exemplified.
Question log (1) regarding the bidding process
Questions (with answers) received from data conversion firms, as well as updates and announcements.
Question log (2) regarding setup and production
Questions (with answers) received from data conversion firms, as well as updates and announcements.

Accumulated Wisdom garnered by the Oxford staff

N.B.: this section appeared originally on the web site of Oxford's Bodleian Library, and represented (mostly) a compilation of email responses to particular issues in the capture and encoding of early modern books.

Encoding

<ADD>
<CLOSER>
DIV types
Drama
<FIGURE>
<GAP>
<HEAD>
<LETTER>
<LG>
<LIST>
Music
<NOTE>
<OPENER>
<Q>
Structure
<TABLE>
Title Pages

Transcription

Abbreviations and Ligatures
Fonts
Foreign alphabets
Miscellaneous
Punctuation
Unresolved Queries
Symbols

Technical

DTD and image sets
Other
