HTMLlat1x.ent:     ISO 8879, provided this notice is included in all copies.
HTMLlat1x.ent:<!ENTITY middot "&#183;"> <!-- middle dot = Georgian comma
HTMLlat1x.ent:                                  = Greek middle dot, U+00B7 ISOnum -->
HTMLlat1x.ent:<!ENTITY divide "&#247;"> <!-- division sign, U+00F7 ISOnum -->
HTMLspecialx.ent:     ISO 8879, provided this notice is included in all copies.
HTMLspecialx.ent:<!ENTITY zwnj    "&#8204;"> <!-- zero width non-joiner,
HTMLspecialx.ent:<!ENTITY zwj     "&#8205;"> <!-- zero width joiner, U+200D NEW RFC 2070 -->
HTMLsymbolx.ent:     ISO 8879, provided this notice is included in all copies.
HTMLsymbolx.ent:<!ENTITY equiv    "&#8801;"> <!-- identical to, U+2261 ISOtech -->
HTMLsymbolx.ent:<!-- dot operator is NOT the same character as U+00B7 middle dot -->
ThML10.dtd:This document describes the Theological Markup Language (ThML), an XML markup language for theological texts. ThML was developed for use in the Christian Classics Ethereal Library (CCEL), but it is hoped that the language will serve as a royalty-free format for theological texts in other applications. Key design goals are that the language should be (1) rich enough to represent information needed for digital libraries and for theological study involving multiple, related texts, including cross-reference, synchronization, indexing, and scripture references, (2) based on XML and usable with World Wide Web tools, (3) automatically convertible to other common formats, and (4) easy to learn and use. ThML is defined as an XML DTD that extends the Voyager DTD for HTML.
ThML10.dtd:The study of theology involves uses of texts that are infrequent in other areas of study. Theological books usually make many references to the Bible, including quotations, commentary, explanations, and citations; special processing for scripture references can aid study. Theological study also often involves texts available in multiple variations or translations, sometimes in Greek or Hebrew, which may have to be synchronized and displayed in parallel columns. It may involve hymns and multiple media. It involves the use of cross-reference systems such as Strongs numbers, various sorts of indexes, and the synchronization of multiple texts in various ways, as for example layers of commentary on a text. Theological study often makes use of several texts related by subject or scripture reference, and tools that support library-wide searching by subject, scripture reference, name, or date would also be desirable. It should be possible for students to relate, combine, or comment on parts of texts in their own document, without altering the originals. A markup language for theological texts should support these applications.
ThML10.dtd:For digital libraries, bibliographic information similar to that stored in a library card catalog should also be represented in the text. This information can be loaded into a software system similar to a library's Public Access Catalog (PAC) to provide an interface for locating books, though unlike a traditional library, searches can also be based on the contents of books. This information should be stored in a standard format so that it can be exported to other PACs and search engines, and searches can look through Internet-based digital libraries and PACs as well as documents on the local computer.
ThML10.dtd:Multiple formats for electronic resources are a reality in today's computing environment. Theological study would also benefit from a markup language that is formally defined, with software for converting to and from other formats. And the language should be royalty free, so that texts can be prepared and distributed by small publishers, scholars, students, and other interested parties as well as large publishers. Ideally, no special purpose software for theological study should be required. This would be possible if all the desired applications could be supported by a web browser, for example. As a result, users would not be "locked in" to a particular proprietary software system. And the language should integrate with the World Wide Web, so that public domain texts can be downloaded from on-line libraries such as the CCEL. It should not be obvious to the user whether a text is on a CD-ROM, the local hard drive, or elsewhere on the Internet, except perhaps by access time.
ThML10.dtd:Existing markup languages do not meet all of these needs. Word processor formats don't represent semantic information about a text, an area in which HTML is also weak. A lack of semantic information makes searching, indexing, converting to other formats, and reuse for other purposes more difficult. The Text Encoding Initiative (TEI)1 application of SGML is semantically rich for literary analysis but is neither easy to learn nor tuned for theological study. It doesn't offer special handling of scripture references or Strongs-like reference systems, for example. Also, the language is very large and the overhead required to learn and process the language is high. Commercial formats, including STEP2 and the Logos Library System (LLS)3, are not designed for integration with the Internet, and preparing texts for these systems requires expensive software, beyond the means of most individuals. Publication in one of these formats may also be controlled by the company or consortium in question. As a result, few public-domain or on-line texts are available in these formats.
ThML10.dtd:After a couple of years' experience with a digital library and thought about the information that needs to be represented in theological texts, the first version of ThML was designed in the summer of 1998, and it has been undergoing scrutiny and improvement since then. This paper describes version 1.0 of the language. Preliminary tools exist for converting Microsoft Word documents to ThML and for converting ThML documents to HTML webs. One open source software project (OpenBible) for Unix is using ThML and a couple of other groups are considering its use. There is a mailing list for those interested in defining and using the language. Several ThML documents exist and have been validated, and others are underway, including Schaff's History of the Christian Church and Calvin's Commentaries.
ThML10.dtd:For the use of digital libraries, bibliographic data about the text should be represented. Markup of subject index entries, scripture references and commentary, names, citations, and dates can be used to build library-wide indexes of those items and assist searching. The original pagination of the document should be represented for bibliographic references. Also, it should be possible to specify a "chunking" of the document for delivery over low-bandwidth connections such as modems.
ThML10.dtd:Since the primary means of delivering digital libraries such as the CCEL are the world wide web and CD-ROM, the language should use web-based technology, including XML and Unicode, and be usable with web browsers. Therefore, the design goals for ThML are these:
ThML10.dtd:* It should be usable with the world wide web and web-tool-based delivery on CD-ROM
ThML10.dtd:Since Theological Markup Language is based on HTML and XML, it supports all of the markup of HTML, a rich linking language in XLink, and stylesheet support in CSS and XSL. HTML may be used for markup of emphasis, paragraphs, headings, lists, tables, block quotes, images and multimedia, scripts, etc. Links may make use of the extended pointer and link types associated with XML, and formatting will be specified in XSL. These facilities will be used wherever possible, to make the language easier to learn for those who already use HTML and easier to use with the World Wide Web. 
ThML10.dtd:<div1> is used for top level parts of a text, including the title page, preface, table of contents, chapters, index, etc. Additional levels are used for lesser structural divisions of the document. These structural divisions show the structure of the original text, and they are also used to prepare a table of contents and allow splitting of a text into files and access to a text by section. The optional title attribute is used in constructing a table of contents and may be used for running heads or other identification purposes. The optional type and n attributes may be used to specify the type and number of the division. If they are present, they will also be used to identify the section in the table of contents.
ThML10.dtd:It is often useful to know the page breaks from the print edition of a book. They may be used as targets for subject index entries identified by page number or to display a text with the pagination of the print edition. Page breaks are marked by the insertion of <pb /> tags, with the n attribute giving the page number of the upcoming page (<pb n="37" /> or <pb n="xii"/>). These elements should appear at the start of the identified page. Many electronic texts will also have images of pages available on line. The pb element will also take an href attribute specifying a URI for an image of the page (<pb n="37" href="gif/0021a.gif" />). 
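As an illustration of the page-break markup described above, the following is a minimal sketch of collecting <pb /> elements from a fragment. The function name page_breaks and the sample fragment are hypothetical, and Python's standard xml.etree.ElementTree parser is an assumption; the source does not say which tools CCEL software uses.

```python
import xml.etree.ElementTree as ET


def page_breaks(xml_text):
    """Return (n, href) pairs for every <pb/> element, in document order.

    href is None when the page break carries no page-image URI.
    """
    root = ET.fromstring(xml_text)
    return [(pb.get("n"), pb.get("href")) for pb in root.iter("pb")]


# Hypothetical fragment built from the two forms shown in the text above.
sample = (
    '<div1><p>Some text<pb n="37" href="gif/0021a.gif" />'
    'more text<pb n="xii"/></p></div1>'
)
breaks = page_breaks(sample)
```

Each pair could then serve as a target for index entries that cite the print edition by page number.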
ThML10.dtd:The parsed attribute consists of a number of references separated by semicolons. Each reference includes the version, book, from chapter, from verse, to chapter, and to verse separated by the vertical bar ('|') character. The version may be omitted. If the from chapter is zero, the whole book is identified. If the from verse is zero, the whole chapter is specified. If the to chapter and verse are zero, a single verse (from chapter:from verse) is specified. 
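The field layout and zero-value rules above can be sketched as a small parser. This is an illustrative sketch only, not CCEL's implementation: the names ScripRef, scope, and parse_parsed_attribute are hypothetical, and it assumes that an omitted version appears as an empty first field (so the reference begins with '|').

```python
from typing import List, NamedTuple


class ScripRef(NamedTuple):
    version: str       # empty string when the version is omitted (assumption)
    book: str
    from_chapter: int
    from_verse: int
    to_chapter: int
    to_verse: int

    def scope(self) -> str:
        """Classify the reference using the zero-value rules in the text."""
        if self.from_chapter == 0:
            return "book"
        if self.from_verse == 0:
            return "chapter"
        if self.to_chapter == 0 and self.to_verse == 0:
            return "verse"
        return "range"


def parse_parsed_attribute(value: str) -> List[ScripRef]:
    """Split on ';' into references, then on '|' into six fields."""
    refs = []
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        fields = item.split("|")
        if len(fields) != 6:
            raise ValueError(f"expected 6 fields, got {len(fields)}: {item!r}")
        numbers = [int(f) for f in fields[2:]]
        refs.append(ScripRef(fields[0], fields[1], *numbers))
    return refs


refs = parse_parsed_attribute("|Rom|8|28|0|0;NIV|Rom|10|8|10|13;|Rom|8|0|0|0")
```

Here the first reference is a single verse, the second a range within chapter 10 with an explicit version, and the third a whole chapter.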
ThML10.dtd:Also, context is sometimes necessary in interpreting a reference. A passage may refer to Romans 8:28 at one point and later to verses 29 and 30 and chapter 10:8-13. It will be up to the editor to identify and mark scripture references, although software may be provided to identify possible scripture references and suggest an interpretation such as <scripRef passage="Rom. 8:29,30">verses 29 and 30</scripRef>. The <scripContext version="NIV" passage="Romans 8" /> element may be used to set the default version, book, or chapter for upcoming references to the parser. 
ThML10.dtd:Explanation or commentary on a passage involves a semantic relationship between the explanation and the passage explained. This relationship should be represented in the text in order to be able to build an index of scripture commentary. For example, it would be useful to be able to see everything the early church fathers said or preached on a passage. Commentary or explanation of a passage will be marked with a <scripCom/> element, as in this example:
ThML10.dtd:Software tools may be provided to use this information in a variety of ways. For example, a program would be able to find other passages on related topics or create an index using the Strongs numbering. Multiple different manuscripts of the same original text could be aligned this way, and displayed in parallel columns, with appropriate software.
ThML10.dtd:In a ThML document, footnotes, endnotes, etc. are all marked with the <note> tag, following the syntax used by TEI Lite8 for the most part. The note element may take the following attributes: place, resp, target, targetEnd, and anchored. The place attribute specifies how it appears in the text (e.g. end, foot, inline, or margin). The target (and targetEnd) attributes refer to the start (and end) of the text being annotated, if the note does not occur in the text at its reference point. These attributes allow the notes to be gathered at the end of a chapter or file if desired. The resp attribute identifies the person responsible for the note, for example the author, editor, or a person's initials. The anchored attribute may take the value yes (default) or no, specifying whether the note is anchored at an exact location; margin notes typically are not anchored.
ThML10.dtd:Citations of other works such as books or treatises may be marked with the <citation> element. That element may also take an href attribute to specify a URI for the cited work, if available. The <date> element may be used to mark dates that occur in the text. A value attribute may be used to specify the date in ISO format, as in this example: <date value="1997.12.25">last Christmas</date>. These elements are used in the CCEL to aid searching and indexing: the insertIndex element (described in the next section) may be used to insert an index of attributions, works cited, dates, or personal names, and searches for names, attributions, dates, etc. can also be supported.
ThML10.dtd:ThML also has support for hymns and hymnals. The <hymn> element may contain elements <meter>, <author>, <tune>, <composer>, <incipit/>, and <music>, in any number and order, as well as other inline elements such as paragraphs, headings, <verse>, etc. The <author> and <composer> elements take authorID and composerID attributes, which may contain a standard identifier such as the CCEL personIDs, as well as the type attribute. The tune element contains the tune name; it may also have a tuneID attribute containing the CCEL tuneID for the tune.
ThML10.dtd:The <meter> element contains the meter as represented in the hymnal. It may also take the standard attribute, giving the meter in a standardized form. The <incipit/> element may be used to represent the first line of the melody as it would be if transposed to C, using f and s for accidentals, e.g. 
ThML10.dtd:Finally, the <music/> element gives a link to an electronic format for the music, perhaps a PDF page image or a midi file. The element takes the href attribute, giving the target, and the type attribute, giving the MIME type of the target. It may also take the inline attribute, with the value yes or no, and the actuate attribute, with the value auto or man. For images, specifying inline="yes" loads the image onto the page; otherwise it is loaded in a separate window. Specifying actuate="auto" makes the image appear automatically; otherwise it appears when a link is clicked. For audio files, specifying inline="yes" brings a player control onto the page. So, for example, to specify that a midi file be played automatically when a page is loaded, and that a player control be displayed, one could use
ThML10.dtd:<music href="hymn.midi" type="application/midi" inline="yes" actuate="auto"/>
ThML10.dtd:The title attribute is used as the identifier in the Table of Contents. The type attribute is used to identify the index that this reference is to be added to; values used for the CCEL include subject (the subject index for the book) and globalSubject (the library-wide subject index). Other values may be used for specialized indexes.
ThML10.dtd:A document may have several user-selected types of index entries. An XML element (<insertIndex type="subject" />) is also provided to specify that a sorted, hierarchical index of all the "subject" (e.g.) index entries should be inserted at that point, with links to the appropriate locations in the text. Certain additional index types are also understood: <insertIndex type="name" /> inserts an index of all names marked with the <name> element; if the title attribute is present, it is used as the index entry. Similarly, indexes may be inserted for citations, dates, foreign words and phrases, images (<img>), names, scripture references (<scripRef>), and scripture commentary (<scripCom>).
ThML10.dtd:Software tools may be provided for composing two documents (which may be the same), using the glossary in one and the text in another. Words of the text defined in the glossary could be footnoted, underlined and linked, or defined in a separate window.
ThML10.dtd:The added and deleted elements take the optional attributes resp, which identifies the person responsible for the changes; reason, which identifies the reason for the change; and date, which gives the time of the change.
ThML10.dtd:These elements all have three optional attributes: lang, scheme, and sub. The lang attribute is the language in an ISO639-1 representation. The scheme attribute identifies the representational scheme. The sub attribute represents subtypes, such as the Created subtype of the Date element. So, for example, the publication date might be represented as 
ThML10.dtd:Biography - it is a biography of a person, who is identified as shown below
ThML10.dtd:	- The target of a biography is identified by the CCEL personID
ThML10.dtd:	- CCEL standard identifier, as described below
ThML10.dtd:    <!-- document for entire series, e.g. a series-wide table of contents-->
ThML10.dtd:The way that ThML is used in the CCEL is in some cases more specific than the description above. For example, the <note> element merely identifies text as a note of some sort. The software used in the CCEL moves this text to the end of a section and links it with a footnote marker. Some of the special uses of ThML have been documented above; some others are given here.
ThML10.dtd:* CCEL URIs are of the form [http://www.ccel.org/]ccel/authorID/bookID[_version].fmt where ".fmt" identifies the desired format, e.g. .htm, .txt, .xml, etc. The URL may also have |id1 or #id2 appended. In the former case, only the element whose ID is id1 is returned. The latter case works like an HTML hash. Both forms may be combined. Three special IDs are also available: _TOC, a machine-generated table of contents; _About, a machine-generated page of information about the document; and _Pnnn, the specified page.
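The URI pattern above can be sketched as a small builder. This is a hedged illustration: the function ccel_uri and its parameter names are hypothetical, the author and book IDs in the usage line are made up, and placing |id1 before #id2 when both are combined is an assumption, since the text does not state the order.

```python
from typing import Optional


def ccel_uri(author_id: str, book_id: str, fmt: str,
             version: Optional[str] = None,
             element_id: Optional[str] = None,
             fragment: Optional[str] = None,
             absolute: bool = True) -> str:
    """Assemble [http://www.ccel.org/]ccel/authorID/bookID[_version].fmt."""
    path = f"ccel/{author_id}/{book_id}"
    if version:
        path += f"_{version}"
    path += f".{fmt}"
    if element_id:
        path += f"|{element_id}"   # return only the element with this ID
    if fragment:
        path += f"#{fragment}"     # behaves like an HTML hash
    prefix = "http://www.ccel.org/" if absolute else ""
    return prefix + path


# Hypothetical IDs, requesting the machine-generated table of contents.
toc = ccel_uri("someauthor", "somebook", "htm", element_id="_TOC")
```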
ThML10.dtd:<comments>A copyright renewal search did not find record of copyright renewal for the source edition of this text.</comments>	
ThML10.dtd:<DC.Publisher>Grand Rapids, MI: Christian Classics Ethereal Library</DC.Publisher>
ThML10.dtd:6 Handheld computers may provide a better user interface than a desktop or laptop computer for reading books. Book-reading software and formats tuned for these devices will doubtless become common, if web browsers don't meet the need.
html-applet.mod:   width       %Length;       #REQUIRED
html-base.mod: <!ELEMENT bdo %Inline;>  <!-- I18N BiDi over-ride -->
html-base.mod: <!-- The TITLE element is not considered part of the flow of text.
html-entities.mod:   id       document-wide unique id
html-entities.mod:  "id          ID             #IMPLIED
html-entities.mod: <!ENTITY % media ""> <!-- override this to add media elements -->
html-forms.mod:  supplanted by a new forms module providing an improved
html-forms.mod:     file | hidden | image | button)"
html-forms.mod:   the fieldset element to avoid mixed content problems
html-frames.mod:   marginwidth %Pixels;       #IMPLIED
html-frames.mod:   marginwidth %Pixels;       #IMPLIED
html-frames.mod:   width       %Length;       #IMPLIED
html-img.mod:    To avoid accessibility problems for people who aren't
html-img.mod:    able to see the image, you should provide a text
html-img.mod:    In addition avoid the use of server-side image maps.
html-img.mod:   width       %Length;       #IMPLIED
html-img.mod:   or an external document, although the latter is not widely supported -->
html-img.mod: <!--================== Client-side image maps ============================-->
html-img.mod:      separate document although this isn't yet widely supported -->
html-loose.mod: <!-- There are also 16 widely known color names with their sRGB values:
html-loose.mod: <!ENTITY % IAlign "(top|middle|bottom|left|right)">
html-loose.mod:   width       %Length;       #IMPLIED
html-loose.mod:   width       %Number;      #IMPLIED
html-loose.mod:   id          ID             #IMPLIED
html-object.mod:   classid     %URI;          #IMPLIED
html-object.mod:   width       %Length;       #IMPLIED
html-object.mod:   id          ID             #IMPLIED
html-tables.mod:  and voice browsers. As a result, W3C hopes to provide an
html-tables.mod:  data explicit, thereby making it easier to provide renderings
html-tables.mod:  CALS to avoid a name clash with the VALIGN attribute.
html-tables.mod: <!ENTITY % TFrame "(void|above|below|hsides|lhs|rhs|vsides|box|border)">
html-tables.mod:   "valign     (top|middle|bottom|baseline) #IMPLIED"
html-tables.mod:   width       %Length;       #IMPLIED
html-tables.mod:   width       %MultiLength;  #IMPLIED
html-tables.mod:  The WIDTH attribute specifies the width of the columns, e.g.
html-tables.mod:      width=64        width in screen pixels
html-tables.mod:      width=0.5*      relative width of 0.5
html-tables.mod:   width       %MultiLength;  #IMPLIED
