TEI Documentation

The encoding conventions of the Exploring Medieval Mary Magdalene project follow the guidelines for electronic text encoding and interchange as laid out by the Text Encoding Initiative (TEI) in http://www.tei-c.org/release/doc/tei-p5-doc/en/html/index.html. The project does not adhere to any customization of TEI but instead draws on the entire standard.

This documentation provides an overview of the mark-up conventions used in the encoding of each textual witness, outlining the basic structure common to each TEI document and accounting for all the elements and attributes used.

The TEI tagging of each text follows a principle of parsimony: only those elements are used that are necessary either from the perspective of the project's editorial principles or from the perspective of the end-user version of each text.

Elements discussed in this document are referred to with both their start- and end-tags in standard TEI notation. Example: <element></element>. Elements that are usually empty are referred to with their simplified tag. Example: <lb/>. Attributes that occur within element tags are marked with the @ symbol. Example: @attribute.


TEI Document Structure


The root element for each TEI document in this project is <TEI xmlns="http://www.tei-c.org/ns/1.0"></TEI>.

It contains the two high-level elements <teiHeader></teiHeader> and <text></text>.


1 TEI Header


The TEI header element <teiHeader></teiHeader> contains all the metadata relevant to each witness and its TEI encoding.

Each TEI header <teiHeader></teiHeader> contains three elements: <fileDesc></fileDesc>, <encodingDesc></encodingDesc>, and <revisionDesc></revisionDesc>.

  • <fileDesc></fileDesc> contains the bibliographic description of the TEI document, including publication statement and source description.
  • <encodingDesc></encodingDesc> provides declarations about the relationship between encoding and the source document.
  • <revisionDesc></revisionDesc> serves to summarize the revision history of each TEI document.

Each of these elements contains further elements that hold the specific information in question. Some of this information is identical in all TEI documents within the project.

Below is an example header, containing the information shared by all TEI documents. Variable information is reflected with italics.




<title>Mary Magdalene Conversion Legend</title>
<author ref="#PseudoIsidore">Pseudo Isidore</author>

<name xml:id="NS">Name of Student</name>
<resp>TEI encoder</resp>

<title>Mary Magdalene Conversion Legend – Digital Edition</title>
<date when="2017-08-25">August 25, 2017</date>

<ref target=""></ref>

<availability status="free"><p>Published under <ref target=""></ref></p></availability>

<repository>Staatsbibliothek Preußischer Kulturbesitz</repository>
<idno>Ms. germ. quart. 261</idno>

<p>Charterhouse St Barbara <origPlace>Cologne</origPlace> in the <origDate notAfter="1600" notBefore="1500">16th cent.</origDate></p>

<witness xml:id="B1">B1</witness>

<schemaRef n="lbp-critical-1.0.0" url="https://raw.githubusercontent.com/lombardpress/lombardpress-schema/develop/src/out/critical.rng"/>

<p>Encoding of this text has followed the published guidelines on…</p>

<revisionDesc status="draft">

<change when="2017-08-28" status="draft" n="0.0.0">
<p>Change date when editing file for the first time</p>
</change>

<change when="2017-08-25" status="draft" n="0.0.0">
<p>Created file for the first time.</p>
</change>

</revisionDesc>
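
For orientation, the header fragments above can be read against the standard TEI P5 header hierarchy. The following is a sketch only: the three top-level children of <teiHeader></teiHeader> are named in this documentation, but the inner wrapper elements (<titleStmt>, <respStmt>, <editionStmt>, <edition>, <publicationStmt>, <authority>, <sourceDesc>, <msDesc>, <msIdentifier>, <history>, <origin>, <listWit>, <editorialDecl>, <listChange>) are assumptions taken from the general TEI guidelines, and the exact nesting in the project files may differ.

```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Mary Magdalene Conversion Legend</title>
      <author ref="#PseudoIsidore">Pseudo Isidore</author>
      <respStmt>
        <name xml:id="NS">Name of Student</name>
        <resp>TEI encoder</resp>
      </respStmt>
    </titleStmt>
    <!-- editionStmt/edition wrapper is assumed -->
    <editionStmt>
      <edition>
        <title>Mary Magdalene Conversion Legend – Digital Edition</title>
        <date when="2017-08-25">August 25, 2017</date>
      </edition>
    </editionStmt>
    <!-- publicationStmt/authority wrapper is assumed; targets are left empty as in the source -->
    <publicationStmt>
      <authority><ref target=""></ref></authority>
      <availability status="free"><p>Published under <ref target=""></ref></p></availability>
    </publicationStmt>
    <!-- msDesc, msIdentifier, history, origin and listWit are assumed wrappers -->
    <sourceDesc>
      <msDesc>
        <msIdentifier>
          <repository>Staatsbibliothek Preußischer Kulturbesitz</repository>
          <idno>Ms. germ. quart. 261</idno>
        </msIdentifier>
        <history>
          <origin>
            <p>Charterhouse St Barbara <origPlace>Cologne</origPlace> in the
              <origDate notAfter="1600" notBefore="1500">16th cent.</origDate></p>
          </origin>
        </history>
      </msDesc>
      <listWit>
        <witness xml:id="B1">B1</witness>
      </listWit>
    </sourceDesc>
  </fileDesc>
  <encodingDesc>
    <schemaRef n="lbp-critical-1.0.0" url="https://raw.githubusercontent.com/lombardpress/lombardpress-schema/develop/src/out/critical.rng"/>
    <!-- editorialDecl wrapper is assumed -->
    <editorialDecl>
      <p>Encoding of this text has followed the published guidelines on…</p>
    </editorialDecl>
  </encodingDesc>
  <revisionDesc status="draft">
    <!-- listChange wrapper is assumed -->
    <listChange>
      <change when="2017-08-28" status="draft" n="0.0.0">
        <p>Change date when editing file for the first time</p>
      </change>
      <change when="2017-08-25" status="draft" n="0.0.0">
        <p>Created file for the first time.</p>
      </change>
    </listChange>
  </revisionDesc>
</teiHeader>
```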






2 Text


2.1 Basic structure


The basic element for the document itself is <text></text>. Since each document contains the transcription of a single text from one witness, <text></text> only contains the <body></body> element.


2.1.1 Body


Within the <body></body> element, the anonymous block element <ab></ab> is used as the general container for all text segments. Since the witnesses themselves do not present their respective texts divided into sections, the paragraph element <p></p> would be too semantically loaded.

Each transcription contains 17 <ab></ab> elements, thus dividing the text into 17 parts. Each <ab></ab> carries a unique @xml:id, as listed below.

  1. <ab xml:id="legend_title"></ab>
  2. <ab xml:id="genealogy"></ab>
  3. <ab xml:id="heritage"></ab>
  4. <ab xml:id="MaryMagdalene_mismanagement"></ab>
  5. <ab xml:id="Martha_confronts_Mary"></ab>
  6. <ab xml:id="MaryMagdalene_lovers"></ab>
  7. <ab xml:id="market_scene"></ab>
  8. <ab xml:id="Mary_at_Marthas_house"></ab>
  9. <ab xml:id="Jesus_enters"></ab>
  10. <ab xml:id="Mary_retreats"></ab>
  11. <ab xml:id="Martha_encourages_Mary"></ab>
  12. <ab xml:id="Martha_in_Simons_house"></ab>
  13. <ab xml:id="Gregory_comment"></ab>
  14. <ab xml:id="Mary_in_Simons_house"></ab>
  15. <ab xml:id="VirginMary_visits"></ab>
  16. <ab xml:id="Mary_rest_of_life"></ab>
  17. <ab xml:id="epilogue"></ab>
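
The resulting skeleton of the <text></text> element can be sketched as follows. This is an abbreviated illustration, not a complete file: the comments stand in for the transcribed text, and only the first two and the last of the 17 blocks are written out.

```xml
<text>
  <body>
    <ab xml:id="legend_title"><!-- transcription of this segment --></ab>
    <ab xml:id="genealogy"><!-- transcription of this segment --></ab>
    <!-- the remaining <ab> elements follow in the order listed above -->
    <ab xml:id="epilogue"><!-- transcription of this segment --></ab>
  </body>
</text>
```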

The following elements are used within the <ab></ab> element: <head></head>, <pb/>, <cb/>, <lb/>, <app></app>.


Title


The <head></head> element is used once in each document to mark the title of the text in each witness.


Folio Beginnings, Column Beginnings


The <pb/> element indicates folio beginnings. The <cb/> element indicates column beginnings on each folio.

Every <pb/> and <cb/> element contains the following attributes: @edRef, @facs, @n.

@edRef notes the source of the folio and column beginning by pointing to the respective manuscript ID. For instance, @edRef="#B1" points to the manuscript with the ID B1.

@facs is used to point to the manuscript image of the corresponding folio.

@n provides a label for each folio and column beginning. This is used to display the folio and column number within the end-user version of the text.

Even if a witness does not have columns, each <pb/> element is followed by a <cb/> element whose @n attribute contains the folio number. This solution has been chosen to provide a unified way of fetching the folio (or column) numbers for display in the end-user version.


Line Beginnings


Each line beginning is marked by the <lb/> element.

Every <lb/> contains a @facs attribute.

@facs serves to identify the specific line in each manuscript image. The identifiers systematically follow the pattern manuscript_folio_column_line. For instance, <lb facs="#B2_159r_a_13"/> is line 13, column a, folio 159r of manuscript B2.
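
As an illustration, a folio and column beginning followed by a line beginning might be encoded as below. This is only a sketch: the <lb/> identifier is the one quoted above, while the @facs values on <pb/> and <cb/> and both @n labels are hypothetical placeholders, since their exact form is not specified in this documentation.

```xml
<!-- Folio 159r of manuscript B2; the facs and n values on pb and cb are hypothetical -->
<pb edRef="#B2" facs="#B2_159r" n="159r"/>
<cb edRef="#B2" facs="#B2_159r_a" n="159ra"/>
<!-- Line 13 in column a of folio 159r -->
<lb facs="#B2_159r_a_13"/>
```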


2.2 Specific Tagging

2.2.1 Abbreviations

All abbreviations found in each witness are tagged with the <abbr></abbr> element. Special characters used within abbreviations are represented by using decimal (&#…) or hexadecimal (&#x…) Unicode notation. Below is a list of the special characters used for abbreviations in the edition.


Symbol  Decimal (&#…)  Hexadecimal (&#x…)  Description
◌̃       771            0303                Combining tilde
◌̄       772            0304                Combining macron
◌̅       773            0305                Combining overline (use when multiple neighboring characters have a stroke over them)
◌̒       786            0312                Combining turned comma above
◌̛       795            031B                Combining horn (use for Latin “er” abbreviation)
ᵗ       7511           1D57                Modifier Letter Small T (superscript)
ꝑ       42833          A751                “p” with stroke through descender (i.e. Latin “per” abbreviation)
Ꝫ       42858          A76A                Upper Case Latin “et” (use for final sideways “m”)
ꝰ       42864          A770                Modifier letter “us” (do not use; use A76F with <sup> instead)
ꝓ       42835          A753                Latin Small Letter P with Flourish
φ       966            03C6                Greek Small Letter Phi
ꝯ       42863          A76F                Latin Small Letter Con
ꝝ       42845          A75D                Latin Small Letter Rum Rotunda
ꝙ       42841          A759                Latin Small Letter Q with Diagonal Stroke
·       183            00B7                Middle Dot
◌̌       780            030C                Combining Caron



The resolution to an abbreviation is tagged with the <expan></expan> element.

Each pair of <abbr></abbr> and <expan></expan> elements is embedded in a <choice></choice> element in order to allow display of the end-user text either with abbreviations or with resolved spelling only.
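
A minimal sketch of such a pair, using the Latin “per” sign (A751) from the table above. Whether <abbr></abbr> wraps only the abbreviation sign or the whole abbreviated word is an assumption here; the project files may handle this differently.

```xml
<!-- Abbreviated form and its resolution, paired so that either can be displayed -->
<choice>
  <abbr>&#xA751;</abbr>
  <expan>per</expan>
</choice>
```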







2.2.2 Highlighting


Highlighted initials are tagged using the <hi></hi> element. Each <hi></hi> element contains a @rend attribute, indicating the specific kind of highlighting present. Possible values are: 


rend="rubr" [information]

rend="decor" non-specific but special decoration


2.2.3 Proper Names


All proper names found in each witness are tagged with the <name></name> element.

Every <name></name> contains at least the @type and @nymRef attributes. Occasionally the @role and @subtype attributes are used to further clarify the entity in question. Occasionally, proper names are highlighted by special visual features; in this case, the @rend attribute is included to specify the visual feature.

@type declares the type of entity the name points to. Values used in this edition are “person”, “org” and “place”.

@subtype further specifies the entity the name points to.

@nymRef points to the canonical form of the name in question.

@role further specifies the entity the name points to.

@rend characterizes special visual features of the name in question. Possible values are:





rend="decor" non-specific but special decoration


Below is a list of all proper names tagged with <name></name>.


Mary Magdalene:        <name type="person" nymRef="Mary_Magdalene"></name>
Martha:                <name type="person" nymRef="Martha"></name>
Eucharia:              <name type="person" nymRef="Eucharia"></name>
Syrus:                 <name type="person" nymRef="Syrus"></name>
David:                 <name type="person" nymRef="David"></name>
Lazarus:               <name type="person" nymRef="Lazarus"></name>
Luke:                  <name type="person" nymRef="Luke"></name>
Virgin Mary:           <name type="person" nymRef="Virgin_Mary"></name>
Isidore:               <name type="person" nymRef="Pseudo_Isidore"></name>
Jesus Christ:          <name type="person" nymRef="Jesus_Christ"></name>
Martilla:              <name type="person" nymRef="Martilla"></name>
Augustinus:            <name type="person" nymRef="Augustine"></name>
Joseph:                <name type="person" nymRef="Joseph"></name>
Tyberius:              <name type="person" role="emperor" nymRef="Tiberius"></name>
Gregory:               <name type="person" nymRef="Gregory"></name>
Joachim:               <name type="person" nymRef="Joachim"></name>
Simon the Pharisee:    <name type="person" nymRef="Simon"></name>
Peter:                 <name type="person" nymRef="Peter"></name>
Mother of God:         <name type="person" nymRef="Virgin_Mary"></name>
King of Kings:         <name type="person" nymRef="King_of_Kings"></name>
God:                   <name type="person" nymRef="God"></name>
Holy Spirit:           <name type="person" nymRef="Holy_Spirit"></name>
Mary Wife of Zebedee:  <name type="person" nymRef="Mary_of_Zebedee"></name>
Mary Wife of Jacob:    <name type="person" nymRef="Mary_of_Jacob"></name>
Children of Israel:    <name type="org" subtype="religion" nymRef="Children_of_Israel"></name>
Tribe of David:        <name type="org" subtype="religion" nymRef="Tribe_of_David"></name>
Jerusalem:             <name type="place" nymRef="Jerusalem"></name>
Jerusalemites:         <name type="place" subtype="residents" nymRef="Jerusalem"></name>
Bethany:               <name type="place" nymRef="Bethany"></name>
Magdalum:              <name type="place" subtype="residence" nymRef="Magdalum"></name>
Constantinople:        <name type="place" nymRef="Constantinople"></name>
Israhel:               <name type="place" nymRef="Israel"></name>
Galilee:               <name type="place" nymRef="Galilee"></name>
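
In running text, the element wraps the name as it is spelled in the witness, while @nymRef carries the canonical form. The sketch below uses attribute combinations from the list above; the use of the list labels as the witness spellings and the pairing of @rend with the name “Martha” are assumptions made for illustration.

```xml
<!-- Witness spelling inside the element, canonical form in nymRef -->
<name type="person" role="emperor" nymRef="Tiberius">Tyberius</name>
<name type="place" nymRef="Israel">Israhel</name>
<!-- A visually highlighted name; this particular pairing is hypothetical -->
<name type="person" nymRef="Martha" rend="decor">Martha</name>
```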


2.2.4 Punctuation


All punctuation, either original or supplied, is tagged with the <pc></pc> element. Four scenarios are possible:

  • Punctuation occurs in the witness. The editor accepts it into the edition. In this case, the <pc></pc> element contains a @source attribute with the value “manuscript”:

<pc source="manuscript">.</pc>

  • No punctuation occurs in the witness. The editor supplies the punctuation. In this case, the <pc></pc> element contains a @resp attribute with the value “editor”:

<pc resp="editor">.</pc>

  • Punctuation occurs in the witness. The editor emends the punctuation. In this case, the <pc></pc> element contains a <choice></choice> element which itself contains the original punctuation tagged with the <orig></orig> element and the editor's emended punctuation tagged with the <reg></reg> element; the <reg></reg> element contains a @resp attribute with the value “editor” as well as a @type attribute with the value “punct”:

<reg resp="editor" type="punct">,</reg>

  • Punctuation occurs in the witness. The editor does not accept it into the edition. In this case, the <pc></pc> element contains a <choice></choice> element which itself contains the original punctuation tagged with the <orig></orig> element and an empty <reg></reg> element; the <reg></reg> element contains a @resp attribute with the value “editor” as well as a @type attribute with the value “punct”:

<reg resp="editor" type="punct"> </reg>
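
Written out in full, the four scenarios might look as follows. This is a sketch assembled from the descriptions above: the nesting follows the prose, but the punctuation mark placed inside <orig></orig> in scenarios 3 and 4 is a hypothetical example, and the empty <reg> is written here in self-closing form.

```xml
<!-- 1. Punctuation from the witness, accepted -->
<pc source="manuscript">.</pc>

<!-- 2. Punctuation supplied by the editor -->
<pc resp="editor">.</pc>

<!-- 3. Punctuation from the witness, emended by the editor (original mark is hypothetical) -->
<pc><choice><orig>.</orig><reg resp="editor" type="punct">,</reg></choice></pc>

<!-- 4. Punctuation from the witness, not accepted: the reg element stays empty -->
<pc><choice><orig>.</orig><reg resp="editor" type="punct"/></choice></pc>
```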




2.2.5 Metamarks


Two special kinds of metamarks appear in the witnesses: line fillers and hyphens used for syllabification. Both are tagged with the <metamark></metamark> element. Each <metamark></metamark> element contains the @type, @function, and @source attributes. For line fillers, the value of @function is “line_filler”; for syllabification, it is “word_division”. For line fillers, the value of @type may vary according to the specific rendition of the line filler; for syllabification, the value of @type is “hyphen” or “double_hyphen”. The value of @source is always “manuscript”.
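
The two cases can be sketched as below. The attribute names and the values “hyphen”, “word_division”, “line_filler” and “manuscript” come from the description above; the split word, the line identifier, the @type value “wavy_line”, and the decision to put the hyphen character inside the element are hypothetical.

```xml
<!-- Hyphen marking a word divided across a line break (word and line ID are hypothetical) -->
con<metamark type="hyphen" function="word_division" source="manuscript">-</metamark><lb facs="#B2_159r_a_14"/>uersio

<!-- Line filler; the type value is a hypothetical label for its rendition -->
<metamark type="wavy_line" function="line_filler" source="manuscript"/>
```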


2.2.6 Uncertain Readings


To represent uncertain readings of the original, the element <unclear></unclear> is used, containing the most likely reading. The reason for the uncertainty may be expressed with the @reason attribute. Possible values include “illegible”, “rubbing”, etc. The degree of certainty may be expressed with the @cert attribute, which may carry the values “low”, “medium”, and “high”.
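
A short sketch of an uncertain reading, using attribute values named above; the enclosed word is a hypothetical placeholder for the most likely reading.

```xml
<!-- Most likely reading of a partly illegible word, with moderate confidence -->
<unclear reason="illegible" cert="medium">word</unclear>
```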


2.2.7 Modifications of the Text


Any modification of the text in a witness is tagged as a specific element in each TEI document. Such modifications include both those that are found in the witness itself, be it through a third party, damage, etc., and those that are based on editorial choices.


Original


The following instances of modifications of the witness itself have been encountered in the course of this project and are tagged accordingly.


Deletions


Deletions occurring in the original are tagged with the <del></del> element. Each <del></del> element contains a @rend attribute, indicating the specific rendition of the deletion. Possible values are:




rend="adapted" [does that really work here?]


Additions


Additions in the original are tagged with the <add></add> element. Each <add></add> element contains a @place attribute, indicating the specific placement of the addition. Possible values are:





place="interlinear"


Hand


If several hands can be identified, the <add></add> element may also include a @hand attribute, identifying the hand responsible for the addition.


Editor


Each TEI document reflects a number of editorial interventions that were deemed necessary by the respective editor in the course of transcribing and encoding the respective witness.


Emendations


Systematic emendations are tagged with the <reg></reg> element. The original reading is tagged with <orig></orig>. Both are embedded in a <choice></choice> element. Each <reg></reg> element contains a @type attribute, indicating the specific type of emendation, and a @resp attribute with the value “editor”. The most common instance of this is normalization of capitalization; the @type attribute then has the value “capit”.





<choice><orig>l</orig><reg type="capit" resp="editor">L</reg></choice>azarus


Corrections


Non-systematic corrections are tagged with the <corr></corr> element. The original reading is tagged with <orig></orig>. Both are embedded in a <choice></choice> element. A <corr></corr> element may contain a @type attribute, indicating the reason for the correction.

If the correction is merely based on a non-standard spelling in the original, the <sic></sic> element is used instead of <orig></orig>.
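
Both variants can be sketched as follows. The element pairing follows the description above; the readings themselves are hypothetical examples, not taken from any witness.

```xml
<!-- Non-systematic correction: original reading paired with the editor's correction -->
<choice><orig>Marhta</orig><corr>Martha</corr></choice>

<!-- Correction based merely on a non-standard spelling: sic replaces orig -->
<choice><sic>Lasarus</sic><corr>Lazarus</corr></choice>
```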





Supplied Omissions


If the editor identifies an obvious (erroneous) omission of text, the corresponding text is supplied using the <supplied></supplied> element.


2.2.8 Critical Apparatus


Variants


The encoding of the critical apparatus is currently in its initial stages. Therefore, only a basic outline of the tagging conventions can be given at this moment.

A critical apparatus tag has the following basic structure:

<app><rdg wit="#ms id">reading from other ms</rdg></app>

Each <rdg></rdg> element contains exactly one reading from another witness, with @wit pointing to its respective ID.
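
With more than one divergent witness, an entry might look like the sketch below. The sigla B1 and B2 appear elsewhere in this documentation; the readings are placeholders.

```xml
<!-- One rdg per witness, each pointing to its manuscript ID -->
<app>
  <rdg wit="#B1">reading from B1</rdg>
  <rdg wit="#B2">reading from B2</rdg>
</app>
```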

Additionally, the <app></app> element may contain a <note></note> element providing additional comments by the editor.


Missing in other witnesses

In the case of missing passages in other witnesses, the <app></app> element cannot contain readings. This also holds true at the level of TEI syntax, because the <rdg></rdg> element should not be empty.


In this case, a <note></note> element is embedded within the relevant <app></app> element stating that text is missing.


Associating the Critical Apparatus to the Text


In order to give a precise location reference for a critical apparatus entry within a text, different methods may be used. In the case of a single word reference, the <app></app> element is placed around it. The word in question is then tagged with the <lem></lem> element. The <rdg></rdg> elements follow after the <lem></lem> element. Example:

<app>
<lem>this</lem>
<rdg wit="#X">that</rdg>
</app> word


Note that this method by default displays as “this word”, i.e. the <lem></lem> element marks its content as belonging to the original text that the apparatus comments on.

Alternatively, the range of an <app></app> element may be defined using the attributes @from and @to. If an apparatus entry concerns a single word, only the @from attribute is used, since the entry then extends only to the one entity it points at. If an apparatus entry concerns an uninterrupted sequence of words, both @from and @to are used.

Both @from and @to point to unique XML IDs that have to be defined for that purpose. Typically, these IDs are defined on a <w></w> element, which is used to tag single words. The IDs follow the pattern already described above, extended by a word number: manuscript_folio_column_line_word.


<w xml:id="S_350r_a_14_05">word1</w>

<w xml:id="S_350r_a_14_08">word4</w>

<app from="#S_350r_a_14_05" to="#S_350r_a_14_08">
     <rdg wit="#xyz">instead these words</rdg>