We use TEI

Ch. 14. The header

14.1 Introduction
14.2 The file description: Title, edition, extent and publication
14.3 Source description: Manuscript description
14.3.1 Manuscript identifier
14.3.2 Intellectual content
14.3.3 Codicological features
14.3.4 The history of the manuscript
14.3.5 Other information
14.3.6 Mark-up for names of persons, places and institutions as well as bibliographical references
14.4 The encoding description
14.5 The profile description
14.6 The revision description
14.7 Example headers

Version 3.0 beta

This is a preliminary version which can be changed or updated at any time.
The whole chapter has been revised by Beeke Stegmann.


14.1 Introduction

This chapter deals with the first major part of any Menota XML file, the header. The header should describe the file so that meta-level information about the text itself, its source, its encoding and its revisions is sufficiently documented. The header has four major parts:

Elements / attributes Contents
<fileDesc> A file description
<encodingDesc> An encoding description
<profileDesc> A text profile
<revisionDesc> A revision history

This chapter will discuss the recommended (minimal) amount of information for each of the four parts. Due to practical considerations, however, the <fileDesc> is treated in two separate subchapters, one (ch. 14.2) dealing with the meta-level information on the file history, the other (ch. 14.3) concerning the source from which the XML file is created, i.e. in our case usually the manuscript.
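As a rough orientation, the four parts can be pictured as children of a single header element. The sketch below assumes the standard TEI wrapper element <teiHeader>, which is taken from the TEI P5 Guidelines rather than from the text above; the comments are placeholders for the content discussed in the following subchapters, so the skeleton is not valid as it stands.

```xml
<teiHeader>
  <fileDesc>
    <!-- title, edition, extent and publication statements (ch. 14.2) -->
    <!-- source description (ch. 14.3) -->
  </fileDesc>
  <encodingDesc>
    <!-- encoding description (ch. 14.4) -->
  </encodingDesc>
  <profileDesc>
    <!-- text profile (ch. 14.5) -->
  </profileDesc>
  <revisionDesc>
    <!-- revision history (ch. 14.6) -->
  </revisionDesc>
</teiHeader>
```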

14.2 The file description: Title, edition, extent and publication

The file description is a mandatory part of the header and must include information on the title, on the publication and on the source, cf. ch. 2.2 “The File Description” of the TEI P5 Guidelines. It contains a number of elements, several of which were discussed in ch. 10 above (<name>, <persName>, <forename>, <surname> and <addName>).

The first part of the <fileDesc> supplies the necessary bibliographical information about the digital text in its title, edition, extent and publication statements. The main editor(s) should always be identified, but since many electronic editions are the result of teamwork (or of cumulative work), all major contributors should be listed. Please note: The description of the source for the transcription (given in the element <sourceDesc>) is also part of the <fileDesc>. However, in order to improve legibility, that part is treated in a separate subchapter (see ch. 14.3 below).

14.2.1 Title statement

Elements / attributes Contents
<titleStmt> Information on the title, editor and other people who have been responsible for the edition
<title> The title of the work
<editor> The name of the editor of the encoded work
   @role The role of an editor, in particular in case the editor is an institution or project
<orgName> The name of an organisation
   @type with the value 'affiliation' Specifies that the organisation is the institution with which the editor is affiliated
<respStmt> A statement of responsibility
<resp> Type of responsibility, e.g. transcription, conversion, proof-reading

In the <titleStmt>, the <title> element gives the title of the document. In the case of a digital transcription, it should specify the primary source (manuscript) on which the transcription is based, and, where applicable, the title of the work transcribed. We recommend that the title also states that the present text is an electronic edition. In single-text manuscripts, the title may look like this:

<title>Holm perg 6 fol : Barlaams ok Josaphats saga : an electronic edition</title>

In multi-text manuscripts, the title will by necessity be somewhat longer:

<title>AM 242 fol (Codex Wormianus) : Snorra-Edda, the four grammatical treatises,
Rígsþula, Maríukvæði, and ókennd heiti : an electronic edition</title>

The full list of titles for the works treated in the document will be given in the <sourceDesc> (see ch. 14.3 below), meaning that it is not necessary to include all details in the document title.

Since manuscript sigla are sometimes given according to various standards (for instance, “Holm perg 6 fol” is referred to in many contexts as “Sth. perg. 6 fol.”), we recommend for Old Norse sources using the sigla in the index volume of Ordbog over det norrøne prosasprog (also accessible on the ONP's web page at http://onpweb.nfi.sc.ku.dk/mscoll_d.html). The manuscript siglum should always be given in full.

In addition to the title, the <titleStmt> must also list the editor(s) and other contributors to the edition. We recommend that one or more people (or institutions) are identified as the main editor(s) of the text in the <editor> element.

    <orgName type="affiliation">University of Bergen</orgName>

Multiple main editors should be listed either in alphabetical order, if equally responsible, or in order of importance concerning the editing work. Institutions as well as individuals may be given as editors:

<editor role="institution">
    <orgName>Gammelnorsk Ordboksverk / Enhet for digital
<editor role="person">
    <orgName type="affiliation">University of Oslo</orgName>

In this example, the attribute @role explains the fact that the first editor is an institution rather than an individual. For the second editor, @role with the value 'person' is optional, but may be inserted for clarity. If an <editor> element does not have any @role attribute, it is assumed to describe a single person.
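The use of @role on <editor> can be sketched with a complete pair of elements. The names below are hypothetical placeholders, and the nesting of <persName> and <orgName> inside a shared <name> element is an assumption based on the elements listed for the title statement, not a structure prescribed in the text above.

```xml
<!-- Hypothetical names throughout; only the element structure is the point -->
<editor role="institution">
  <name>
    <orgName>Example Manuscript Project</orgName>
  </name>
</editor>
<editor role="person">
  <name>
    <persName>
      <forename>Anna</forename>
      <surname>Example</surname>
    </persName>
    <orgName type="affiliation">University of Oslo</orgName>
  </name>
</editor>
```

Leaving out role="person" on the second <editor> would not change its interpretation, since an <editor> without @role is read as a single person.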

If people other than the main editor(s) contributed to the work, this is specified in one or more responsibility statements (<respStmt>). In most cases it will be preferable to use one <respStmt> per task or set of tasks (rather than per contributor) and list all the contributors who worked on this specific task or these tasks. In order to clarify the division of work and responsibility, the main editors should also be added to the relevant responsibility statements.

  <resp>Lemmatisation and morphological encoding</resp>
        <forename>Jon Erik</forename>
      <orgName type="affiliation">University of Bergen</orgName>

  <resp>Conversion to Menotic XML</resp>
        <forename>Christian Emil</forename>
      <orgName type="affiliation">University of Oslo</orgName>

If contributors are responsible for several activities, this may be specified in more than one <resp> element within the same <respStmt>, listed in chronological order. Note, however, that if more than one contributor is listed in a single responsibility statement, this will always be interpreted as meaning that all listed contributors worked on all the tasks given in the <respStmt> in question. If this is not the case, multiple responsibility statements should be used.

  <resp>Transcription of primary source</resp>
  <resp>Conversion to XML</resp>
        <forename>Karl G.</forename>
      <orgName type="affiliation">University of Oslo</orgName>
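The rule that every contributor in a <respStmt> is credited with every <resp> in it can be sketched as follows. The names are hypothetical placeholders, and wrapping each contributor in a <name> element is an assumption about the intended nesting. In the first statement one person is credited with two tasks; in the second, two people are both credited with the same single task.

```xml
<!-- Hypothetical contributors; one <respStmt> per task or set of tasks -->
<respStmt>
  <resp>Transcription of primary source</resp>
  <resp>Conversion to XML</resp>
  <name>
    <persName><forename>Anna</forename> <surname>Example</surname></persName>
    <orgName type="affiliation">University of Oslo</orgName>
  </name>
</respStmt>
<respStmt>
  <resp>Proof-reading</resp>
  <name>
    <persName><forename>Anna</forename> <surname>Example</surname></persName>
    <orgName type="affiliation">University of Oslo</orgName>
  </name>
  <name>
    <persName><forename>Bjarni</forename> <surname>Sample</surname></persName>
    <orgName type="affiliation">University of Bergen</orgName>
  </name>
</respStmt>
```

If the second contributor had only proof-read part of the text, that would call for a further <respStmt> rather than an extra <resp> in the second one.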

As discussed in ch. 10 above, patronymica should not be encoded as surnames, but rather as additional names:

<persName><forename>Haraldur</forename>
  <addName type="patronym">Bernharðsson</addName></persName>
<orgName type="affiliation">University of Iceland</orgName>

When listing persons in alphabetical order, a surname should be given before any forenames, e.g. “Rindal, Magnus”. In the absence of a surname, a forename is given before an additional name, e.g. “Haraldur Bernharðsson”.

The TEI P5 Guidelines also recommend that the element <author> is included in the <titleStmt> (ch. 2.2.1 “The Title Statement”). Since almost all Medieval Nordic texts are anonymous, we believe this element is not required.

14.2.2 Edition statement

Elements / attributes Contents
<editionStmt> A statement of the edition
<edition> A description of the edition (i.e. version), typically by means of a number
   @n The number of the edition

The <editionStmt> should be used to specify whether the present text is a new or a revised edition of the electronic text as described in the title statement above. Here, “edition” is to be understood as “version”. The version number should be given in the @n attribute with the usual number system, i.e. 1.0, 1.0.1, 1.1, 1.2, etc., while the date of the version should be given in the format year-month-day in the @when attribute of a <date> element, e.g. '2004-02-01'.

A complete edition statement may be as simple as this:

<editionStmt>
  <edition n="1.0">First draft, <date when="2004-02-01">
       1 February 2004</date>.</edition>
</editionStmt>

14.2.3 Extent

Elements / attributes Contents
<extent> The size of the file, preferably specified in words
   @n The number of words (or any other measure)

The <extent> element specifies the size of the file. The exact number of words should be given in the @n attribute as well as in plain text within the element, e.g.:

<extent n="76411">76411 words</extent>

14.2.4 Publication statement

Elements / attributes Contents
<publicationStmt> A statement of the publication
<distributor> A reference to the distributor, e.g. Medieval Nordic Text Archive
<idno> A reference (identification number), e.g. “Ms. 1”
   @type The type of reference, e.g. 'Menota'
<date> The date for the publication of the edition
   @when The date in the year-month-day format, e.g. 2017-03-08
<availability> A description of the conditions for the distribution and use of the text
   @status The type of availability, typically with the values 'free', 'restricted' or 'unknown'

The <distributor> element specifies the body (publisher, archive) which has made the text available, e.g. the Medieval Nordic Text Archive (Menota).

The <idno> is a unique identification of the text. For texts in the Menota archive, the value of the @type attribute will be 'Menota', and the contents of the element will be an acquisition number, beginning with Ms. 1. Note that this information will be supplied by Menota if the text is being deposited in this archive.

The <availability> element specifies the accessibility of the text. We recommend adding a @status attribute with one of the three values 'free', 'restricted' or 'unknown' (cf. ch. 2.2.4 “Publication, Distribution, etc.” of the TEI P5 Guidelines). Further specifications can be added in a <p> element. Texts in the Menota archive typically have restricted availability, and we recommend adding this description: “This text is available for purposes of academic research and teaching only. Re-distribution in any form without prior permission is prohibited. Short extracts may be cited with full acknowledgment of the source.” A complete publication statement may thus look like this:

<publicationStmt>
  <distributor>Medieval Nordic Text Archive</distributor>
  <idno type="Menota">Ms. 1</idno>
  <date when="2004-03-01">1 March 2004</date>
  <availability status="restricted">
    <p>This text is available for purposes of academic research and
       teaching only. Re-distribution in any form without prior
       permission is prohibited. Short extracts may be cited with full
       acknowledgment of the source.</p>
  </availability>
</publicationStmt>
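To show how the four statements sit together, the following sketch assembles them in the order in which they have been discussed, followed by the source description. The values are simply reused from the separate examples in this subchapter and do not describe one real file; the order of the child elements follows the TEI P5 content model for <fileDesc> as far as we are aware, and the comments stand in for content that a valid header would need.

```xml
<fileDesc>
  <titleStmt>
    <title>Holm perg 6 fol : Barlaams ok Josaphats saga : an electronic edition</title>
    <!-- <editor> and <respStmt> elements (ch. 14.2.1) -->
  </titleStmt>
  <editionStmt>
    <edition n="1.0">First draft, <date when="2004-02-01">1 February 2004</date>.</edition>
  </editionStmt>
  <extent n="76411">76411 words</extent>
  <publicationStmt>
    <distributor>Medieval Nordic Text Archive</distributor>
    <idno type="Menota">Ms. 1</idno>
    <date when="2004-03-01">1 March 2004</date>
    <availability status="restricted">
      <p>This text is available for purposes of academic research and teaching only.</p>
    </availability>
  </publicationStmt>
  <sourceDesc>
    <!-- <msDesc> with the manuscript description (ch. 14.3) -->
  </sourceDesc>
</fileDesc>
```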

14.3 Source description: Manuscript description

The <sourceDesc> is a mandatory part of the header and describes the source material (cf. ch. 2.2.7 “The Source Description” of the TEI P5 Guidelines). It is a child of <fileDesc>, and in the case of a digital edition, the source is the manuscript carrying the transcribed text. Therefore, the source is described using the element <msDesc> (manuscript description), which is placed within <sourceDesc>. With TEI P5, this part of the header includes specific elements for manuscript description, based chiefly on the work of the EU-funded MASTER project (1999-2001) and the TEI Medieval Manuscripts Description Work Group (1998-2000). For more detailed information on the manuscript description module, see ch. 10 “Manuscript Description” of the TEI P5 Guidelines.

The <msDesc> element is the framing element into which the manuscript description is put. The description need not consist of more than the basic information necessary to identify the source, i.e. its location, both geographical and institutional, and its shelfmark or other identifying number or name (e.g. Oslo, Universitetsbibliotek, UB 1042 8vo). However, it is also possible to provide a detailed description of the source, analogous to what one would find in a thorough catalogue record or in the introduction to a scholarly edition. (Note that while the <msDesc> element will normally appear within <sourceDesc> in the document header, it can also appear anywhere within the body of a TEI conformant document, in the same way as the bibliographic elements <bibl>, <biblStruct> and <biblFull>.)
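A minimal description of this kind might be sketched as follows, using the identifying elements introduced in ch. 14.3.1. The distribution of “Oslo, Universitetsbibliotek, UB 1042 8vo” over <settlement>, <repository> and <idno> is our reading of that example, not an encoding taken from an existing file.

```xml
<sourceDesc>
  <msDesc>
    <msIdentifier>
      <settlement>Oslo</settlement>
      <repository>Universitetsbibliotek</repository>
      <idno>UB 1042 8vo</idno>
    </msIdentifier>
  </msDesc>
</sourceDesc>
```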

Within <msDesc> the following six elements are available, of which only the first is required:

Elements Contents
<msIdentifier> Groups information that uniquely identifies the manuscript, i.e. its location, holding institution and shelfmark.
<msContents> Contains an itemised list of the intellectual content of the manuscript or manuscript part, either as a series of paragraphs or as a series of structured manuscript items, possibly including transcriptions of rubrics, incipits, explicits etc., as well as primary bibliographic references.
<physDesc> Groups information concerning all physical aspects of the manuscript or manuscript part, its material, size, format, script, decoration, binding, marginalia etc.
<history> Provides information on the history of the manuscript or manuscript part, its origin, provenance and acquisition by its holding institution.
<additional> Groups other information about the manuscript, in particular, administrative information relating to its availability, custodial history, surrogates etc.
<msPart> Contains in essence a nested <msDesc>, in cases of composite manuscripts now regarded as constituting a single unit but made up of two or more parts which were originally physically distinct; since the contents, physical description and history of the individual parts will normally be quite different, an <msPart> element can contain all the elements listed here, including additional <msPart> elements.

Within each of these elements a number of sub-elements is available; <msContents>, for example, will normally consist of one or more <msItem> elements, each in turn containing specific elements such as <title>, <author>, <locus>, <rubric>, <incipit>, <explicit> or <colophon> (see ch. 14.3.2 below). Technically, the contents need not be this structured, since with all the elements listed above, apart from <msIdentifier>, there is also the option of using ordinary prose, marked up with the <p> element. However, doing so would greatly limit the possibilities both for processing and for searching the data. On the other hand, it could be preferable when dealing with pre-existing descriptions (so-called “legacy data”), the exact form of which one may wish, or be required, to maintain. If at all possible, we recommend using the available specific elements to structure and mark up the manuscript description.
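The difference between the two approaches can be sketched with the same piece of information encoded twice. The two <msContents> elements below are alternatives and would not appear together in one description; the prose wording of the first is illustrative only.

```xml
<!-- Alternative 1: ordinary prose, easy to carry over from legacy data -->
<msContents>
  <p>Ff. 1r-9v contain Knýtlinga saga.</p>
</msContents>

<!-- Alternative 2: structured items, easier to process and search -->
<msContents>
  <msItem>
    <locus from="1r" to="9v">1r-9v</locus>
    <title>Knýtlinga saga</title>
  </msItem>
</msContents>
```

In the structured form, a query for all items with a given <title>, or for everything found on a given range of leaves, does not have to rely on free-text matching.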

14.3.1 Manuscript identifier

The only mandatory element within <msDesc> is <msIdentifier>. For <msIdentifier>, a number of sub-elements is available, among others, <country>, <region>, <settlement> (the TEI term for what most people would call city), <institution>, <repository>, <collection> and <idno> (an identifying number, here used for the shelfmark of a manuscript). Although not required, it is strongly recommended that at least the elements <settlement>, <repository> and <idno> are included, since they provide what is, by common consent, the minimum amount of information necessary to identify a manuscript. In many cases, no other elements are needed, as common sense will suffice to distinguish, say, Paris, France from Paris, Texas, as the location of the Bibliothèque Nationale. For search purposes, however, it is probably a good idea to include as much information as possible, such as <country> and, where applicable, <region>.

There are two further sub-elements of <msIdentifier>: <altIdentifier>, which contains an alternative, structured identifier of a manuscript, such as a catalogue number or former shelfmark, and <msName>, which contains any form of unstructured alternative name used for a manuscript, such as a nickname. The manuscript Uppsala, Universitetsbiblioteket, DG 1, for example, is far better known under the name Codex Argenteus or the Silver Bible. There are many examples of such nicknames among Nordic manuscripts, for instance in the Arnamagnæan Collection, as Árni Magnússon frequently gave his manuscripts names based on the places where they had been made or where he got them from, or on the people whom he knew to have produced or possessed them. Occasionally a manuscript can have several such names, or perhaps rather several forms of the name, typically in different languages; in either case, multiple <msName> elements are used. Forms in different languages can be distinguished from each other by means of the @xml:lang attribute, which is available on all TEI elements.
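For the Uppsala manuscript just mentioned, the two names could be recorded along the following lines. This is a sketch only: the @xml:lang values 'la' and 'en' are our own labelling of the two name forms.

```xml
<msIdentifier>
  <settlement>Uppsala</settlement>
  <repository>Universitetsbiblioteket</repository>
  <idno>DG 1</idno>
  <!-- Two forms of the nickname, distinguished by language -->
  <msName xml:lang="la">Codex Argenteus</msName>
  <msName xml:lang="en">Silver Bible</msName>
</msIdentifier>
```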

A typical <msIdentifier> for a manuscript in the Arnamagnæan Collection looks like this:

<msIdentifier>
  <country key="DK">Danmark</country>
  <repository>Den Arnamagnæanske Samling</repository>
  <idno>AM 45 fol.</idno>
  <altIdentifier type="KKKat">
    <idno><!-- running number in Kålund's catalogue --></idno>
  </altIdentifier>
  <msName type="nickname" xml:lang="la">Codex Frisianus</msName>
  <msName type="nickname" xml:lang="is">Fríssbók</msName>
</msIdentifier>

The value of the @key attribute on <country>, which is the standard international two-letter code, is useful for search purposes, enabling one to find all manuscripts in Danish repositories regardless of whether the cataloguer has given the name of the country as “Denmark” or “Danmark” (or, for that matter, “Dänemark”, “Dinamarca” or “Дания”). There are many such attributes in the manuscript description tagset which allow for cross-language searches. In addition to the commonly used shelfmark, this example also has an <altIdentifier>, giving the running number from Kålund's printed catalogue. That this reference is to Kålund's catalogue can be seen from the @type attribute with the abbreviation “KKKat” as its value.

14.3.2 Intellectual content

Detailed description of a manuscript’s intellectual contents is put in the <msContents> element, which is the next major sub-element of <msDesc>. <msContents> consists of one or more <msItem> elements, which can be prefaced, if desired, by a <summary> element.

An <msItem> element typically contains at least the elements <locus> and <title> to specify the location in the manuscript and the title of the text in question. More detail can be provided by means of further elements, such as <author>, <incipit>, <explicit>, <rubric>, <finalRubric>, <colophon>, <textLang> and <note>. <msItem> elements are further allowed to “nest”, by which is meant that an <msItem> can contain other <msItem> elements. This is useful where separate items (or subitems) in a manuscript are grouped under a single title or rubric, for example in collections of prayers.
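Nesting can be sketched as follows for a collection of prayers grouped under a common title. The loci and titles are hypothetical, and the @n values merely illustrate one possible numbering of items and subitems.

```xml
<msContents>
  <summary>A collection of prayers.</summary>
  <msItem n="1">
    <locus from="1r" to="4v">1r-4v</locus>
    <title>Prayers</title>
    <!-- Subitems grouped under the title of the parent item -->
    <msItem n="1.1">
      <locus from="1r" to="2v">1r-2v</locus>
      <title>First prayer</title>
    </msItem>
    <msItem n="1.2">
      <locus from="3r" to="4v">3r-4v</locus>
      <title>Second prayer</title>
    </msItem>
  </msItem>
</msContents>
```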

A @defective attribute, with possible values of 'true', 'false', 'unknown' or 'unspecified', is available on <msItem>, providing the means of distinguishing between texts which are fragmentary and those which are not. The attribute is also available on the specialised elements <incipit> and <explicit>. When dealing with collections of fragments, each fragment may be given as a separate <msItem> and the first and last words of each transcribed as defective incipits and explicits, as in the following example, a manuscript containing four fragments of a single work:

<msItem n="1" defective="true">
  <locus from="1r" to="9v">1r-9v</locus>
  <title>Knýtlinga saga</title>
  <msItem n="1.1">
    <locus from="1r:1" to="2v:30">1r:1-2v:30</locus>
    <incipit defective="true">dan<ex>n</ex>a a engl<ex>an</ex>di</incipit>
    <explicit defective="true">en meðan har<ex>aldr</ex>
      hein hafði k<ex>onung</ex>r v<ex>er</ex>it yf<ex>ir</ex></explicit>
  </msItem>
  <!-- msItems 1.2 to 1.4 -->
</msItem>

The standard TEI element <bibl> (and the grouping parent element <listBibl>) is also available within <msItem>. This should be used to provide bibliographical information on the <msItem> level, i.e. concerning editions of the item in question. Bibliographical information pertaining to the manuscript as a whole should be placed in the <additional> element, described below (see ch. 14.3.5).
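An item-level reference might be placed as in the sketch below. The bracketed text is a placeholder, not a real reference, and would be replaced by the actual details of an edition of the item in question.

```xml
<msItem n="1">
  <locus from="1r" to="9v">1r-9v</locus>
  <title>Knýtlinga saga</title>
  <!-- Editions of this particular item, not of the manuscript as a whole -->
  <listBibl>
    <bibl>[Editor], <title>[Title of the edition]</title>, [place and year of publication]</bibl>
  </listBibl>
</msItem>
```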

14.3.3 Codicological features

The following major element in <msDesc> is <physDesc>, i.e. physical description. The first element within <physDesc> is <objectDesc>, which relates specifically to the text-bearing object and contains two further sub-elements, <supportDesc> and <layoutDesc>.

<supportDesc> can contain various elements relating to the physical object, or carrier, on which the text is inscribed, such as <support>, describing whether the text is written on parchment, paper etc. and a description thereof; <extent>, the number and size of leaves; <foliation>, how and, if known, when and by whom the manuscript has been paginated/foliated; <collation>, a description of the quire structure, any missing leaves and so on; and <condition>, for a description of the present physical state of the manuscript. <layoutDesc> contains one or more <layout> elements, detailing the way(s) in which the text is organised on the page, specifying the number of columns, the dimensions of the written area, the number of lines per page/column etc. The example below shows a simple <objectDesc>. Please note that this <objectDesc> only makes use of selected elements.

<objectDesc form="fragment">
  <supportDesc>
    <extent>2 leaves; 220 mm (height) by 150
      mm (width).</extent>
    <foliation>
      <p>The manuscript is foliated: 1-2.</p>
    </foliation>
    <collation>One bifolium.</collation>
  </supportDesc>
  <layoutDesc>
    <layout columns="1" writtenLines="23 26">
      <p>The text is written in one column of 23-26 lines each;
       there are approximately 10-12 words per line;
       the written area measures 205 mm (height) by 130 mm (width).</p>
    </layout>
  </layoutDesc>
</objectDesc>

The second group of elements within a structured physical description concerns aspects of the writing, illumination or other notation (notably, music) found in a manuscript, including additions made in later hands – the text, as opposed to the carrier. Possible elements are: <handDesc>, usually containing one or more <handNote> elements, <decoDesc>, containing one or more <decoNote> elements, as well as <musicNotation> and <additions>, both containing one or more paragraph elements.

The <handDesc> element is intended for a description of the scribal hand or hands of the manuscript. This may simply be encoded as one or more <p> elements, but more commonly, the various paragraphs are structured as a series of <handNote> elements, each containing a prose description of one of the hands. The following is an example of a brief <handDesc> element for a manuscript written by a single scribe:

<handDesc hands="1">
  <p>Written in <term type="script">Gothic hybrid</term>.
   The scribe is unknown, but the same hand is found in sections
   of AM 23 4to and GkS 25 fol.</p>
</handDesc>

Please note: The use of the TEI element <term> with its attribute @type as in the example is optional. Such encoding allows for more precise searching than would be possible with free text, but is obviously dependent on there being a commonly agreed taxonomy.

The next example (below) is taken from a manuscript written by six different scribes, where a <handNote> element is used to describe each individual hand. The @script attribute on <handNote> is employed to indicate the type of script. Note also the @scope attribute, with the possible values of 'major', 'minor' and 'sole'. Finally, the <locus> element, which we have already encountered in <msItem>, is used to indicate specifically which parts of a manuscript are written in a given hand.

<handDesc hands="6">
  <handNote script="Hybrida" scope="major">
    <p>The main hand (Hand 1) writes <locus>ff. 1r-9r
    and 16r-118v</locus> in a practised Gothic hybrid.</p>
  </handNote>
  <!-- more handNote elements follow here -->
</handDesc>

As each <handNote> element may contain one or more paragraphs, there is no limit to the amount of information which may be given on any single hand. Thus, a detailed analysis of palaeographical and orthographical features (“/a/ is of the two-storey kind” etc.) is perfectly possible within this overall structure.

There is a corresponding element, <decoDesc>, for the description of illumination and other decorative features in the manuscript. <decoDesc>, like <handDesc>, may simply contain one or more paragraphs, but ideally, it consists of a sequence of topically organised sub-elements, called <decoNote>, each describing either a decorative component of a manuscript (e.g. a single illuminated initial) or a homogeneous class of such components (e.g. sentence initials in general). The example below shows the usage of multiple <decoNote> elements in <decoDesc>. For detailed step-by-step guidelines on the description of illuminations using <decoNote> see ch. 7.2.

<decoDesc>
  <decoNote>
    <p>F. <locus from="1v" to="1v">1v</locus>: full-page illumination,
      Holy King <name type="person">Óláfr</name>.
      Caption: <q>Olafur · Haraldz son · Noreks kongur</q>.
      Colours: yellow, dark green, light red (vermillion), outline: dark brown.</p>
  </decoNote>
  <decoNote>
    <p>F. <locus from="3r" to="3r">3r:24-36</locus>: major pen flourished initial L.
      Colours: blue, light red.</p>
  </decoNote>
  <decoNote>
    <p>F. <locus from="7r" to="7r">7r</locus>: bas-de-page,
      two dogs running, a lamb and a bird.
      Colours: black, dark red, white(?).</p>
  </decoNote>
  <decoNote>
    <p>Ff. <locus from="1r" to="27v">1r-27v</locus>: sentence initials were added later,
      potentially by <name type="person">Jón Jónsson</name></p>
  </decoNote>
</decoDesc>

If a manuscript contains musical notation, the element <musicNotation> may be used to describe it. The form, and possibly location, of such musical notation is specified using one or more paragraphs:

<musicNotation>
  <p>Fol. <locus>34v</locus>: Square notation on 4-line red staves.</p>
</musicNotation>

Finally, the <additions> element is used to list or describe any marginalia or other additions to the manuscript which may be considered of interest or importance. Such additions can also be referred to and discussed elsewhere, for example as part of the <history> element in cases where the marginalia provide evidence of ownership (see ch. 14.3.4 below). In those cases we recommend describing (and potentially transcribing) the marginalia in detail in the <additions> element, while merely referring to them in other contexts.

<additions>
  <p>The manuscript contains the following marginalia:
    <list>
      <item>Fol. <locus>4v</locus>, left margin: <q xml:lang="is">hialmadr
       <ex>ok</ex> <lb/>brynjadr</q>, in a fifteenth-century hand, imitating
       an addition made to the text by the scribe at this point.</item>
      <item>Fol. <locus>5r</locus>, lower margin: <q xml:lang="is">
       þ<ex>e</ex>tta þiki m<ex>er</ex> v<ex>er</ex>a gott blek en<ex>n</ex>da
       kan<ex>n</ex> ek ecki betr sia</q>; fifteenth-century hand, probably
       the same as that on the previous page.</item>
      <item>Fol. <locus>9v</locus>, bottom margin: <q xml:lang="is">þessa
       bok uilda eg <sic>gæt</sic> lært med<lb/>an Gud gefe myer Gott ad
       <lb/>læra</q>; seventeenth-century hand.</item>
    </list>
  </p>
</additions>

It should be noted that the various mechanisms for the transcription of primary sources described elsewhere in this handbook – expansion of abbreviations and so on – may be employed here as well.

The third group of elements in <physDesc> pertains to things such as the binding and other material that might be attached to the manuscript or stored with it. These elements include <bindingDesc>, containing a description of the state of the present as well as potential former bindings of a manuscript (given either as a series of paragraphs or as one or more distinct <binding> elements); <sealDesc>, which supplies information about the seal(s) attached to a document or charter (again either as paragraphs summarising the overall nature of the seals, or as one or more <seal> elements); and <accMat>, for describing and/or transcribing any material not originally part of the manuscript but bound with it or otherwise accompanying it. For example, the small paper slips on which Árni Magnússon frequently noted details about a manuscript or its provenance, which are frequently kept with the manuscript in question, can be described here using one or more paragraphs.
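A sketch of this third group, placed at the end of <physDesc>, might look as follows. The bracketed descriptions are placeholders for illustration and do not refer to a particular manuscript; the position of the elements after the other parts of <physDesc> follows the TEI P5 content model as far as we are aware.

```xml
<physDesc>
  <!-- objectDesc, handDesc, decoDesc, additions etc. come first -->
  <bindingDesc>
    <binding>
      <p>[Description of the present binding and its state.]</p>
    </binding>
  </bindingDesc>
  <accMat>
    <p>[Description, and possibly transcription, of a paper slip with a
      note by Árni Magnússon that is kept with the manuscript.]</p>
  </accMat>
</physDesc>
```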

14.3.4 The history of the manuscript

The <history> element contains information on the history of the manuscript. Available within it are three sub-elements: <origin>, for information on when, where and potentially for whom the manuscript was written; <provenance>, in which any evidence of ownership and use is provided, and <acquisition>, which describes when and how the manuscript was acquired by its current owner or holding institution. Each of these elements may contain one or more paragraphs. Alternatively, as with the other major elements in a manuscript description, the <history> element may itself consist of one or more paragraphs in which the history of the manuscript is summarized.

Brief information about the origin of a manuscript is often available in catalogues. In the recommended XML-structure it can be encoded like this:

<origin>
  <p>Written in <origPlace>Vadstena, Sweden</origPlace> in
    <origDate notBefore="1500" notAfter="1550">the first half
      of the 16th century</origDate>.</p>
</origin>

The usage of elements such as <origPlace> and <origDate> with various suitable attributes is not required but highly recommended as it facilitates searches.

Information regarding the provenance and acquisition of a manuscript is frequently more difficult to obtain, if at all available. In manuscripts from the Arnamagnæan Collection, Árni Magnússon quite frequently left note slips containing such detail. While the text of his notes would usually be transcribed in the element <accMat> described above, it can be discussed and interpreted in the elements <provenance> and/or <acquisition>. The same applies to other additions and marginalia (noted and potentially transcribed in <additions>) that may contain hints at the provenance or general history of a manuscript.
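The division of labour between transcription and interpretation might be sketched like this. The bracketed text consists of placeholders only; the point is that <provenance> and <acquisition> discuss evidence which is itself transcribed in <accMat> or <additions>.

```xml
<history>
  <!-- <origin> as in the example above -->
  <provenance>
    <p>[Discussion of earlier owners, referring to the note slip
      transcribed in accMat and to the marginalia listed in additions.]</p>
  </provenance>
  <acquisition>
    <p>[When and how the manuscript was acquired by its current
      holding institution.]</p>
  </acquisition>
</history>
```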

14.3.5 Other information

The final large grouping element in a manuscript description is the already mentioned element <additional>. The first subsection of this element is called <adminInfo>, which, as its name suggests, contains information pertaining to the curation and management of the manuscript. Such information would not normally form part of the introduction to a scholarly edition, but there is no reason why it could not be included in the document header. Sub-elements available here include <custodialHist>, in which information can be given on such matters as conservation, loans, exhibitions and so on, either as a series of paragraphs or as one or more dated <custEvent> elements, and the standard TEI element <availability>, for information on the availability of the manuscript, for example any restrictions on its use or access.
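A minimal <adminInfo> could be sketched as follows. The date and the bracketed descriptions are hypothetical, the @type and @when attributes on <custEvent> are taken from the TEI P5 Guidelines rather than from the text above, and the order of <availability> before <custodialHist> follows the TEI content model as far as we are aware.

```xml
<additional>
  <adminInfo>
    <availability status="restricted">
      <p>[Any restrictions on the use of, or access to, the manuscript.]</p>
    </availability>
    <custodialHist>
      <!-- One dated event; the year is a hypothetical placeholder -->
      <custEvent type="conservation" when="1990">[Description of the
        conservation work carried out.]</custEvent>
    </custodialHist>
  </adminInfo>
</additional>
```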

Also available within <additional> is a <surrogates> element for information on photographic reproductions. Here it is possible to provide information on, and links to, any digital reproductions of the manuscript that may be available.

Finally, the element <listBibl> is available within <additional> for bibliographical information regarding the manuscript as a whole. Inside <listBibl>, one <bibl> element is used for each bibliographic reference. References to editions or other bibliographical information relevant to individual text items, on the other hand, should rather be given under the appropriate <msItem> (see ch. 14.3.2 above).
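The remaining two parts of <additional> might be sketched as follows. The link target is a placeholder address and the bracketed text stands for real descriptions and references; the use of <ref> with a @target attribute for the link is an assumption based on general TEI practice.

```xml
<additional>
  <!-- <adminInfo>, if present, comes first -->
  <surrogates>
    <p>[Description of a digital reproduction], available at
      <ref target="https://example.org/facsimile">[name of the image archive]</ref>.</p>
  </surrogates>
  <listBibl>
    <!-- References concerning the manuscript as a whole -->
    <bibl>[Catalogue entry describing the manuscript.]</bibl>
    <bibl>[Study of the manuscript as a whole.]</bibl>
  </listBibl>
</additional>
```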

14.3.6 Mark-up for names of persons, places and institutions as well as bibliographical references

Most of the elements that have been mentioned so far have the character of predefined boxes into which information of a certain type can be fitted. But it will be noted in the examples cited that there are other kinds of elements which can appear anywhere within the document, so-called “phrase-level elements”, of which there is a large number available in the TEI standard. These are primarily used in order to facilitate certain types of processing and/or for search purposes. All names (both in the header and the text), for example, can be tagged using the <name> element, which may have a @type attribute to indicate whether they are the names of persons, places or organisations (such as religious orders). If desired, more information about persons can be provided by linking to a detailed description by means of the attribute @key. This requires that that particular key has been defined elsewhere, for instance in a <listPerson> element within the header's <profileDesc> (see ch. 14.5 below). <listPerson> can contain one or more <person> elements, which are identified by a unique value of the @xml:id attribute. The same value is then given in the @key attribute of any <name> element that is pointing to the longer description. The contents of the individual <person> elements in <listPerson> provide information on birth, death, residence, occupation and so on, either as one or more paragraphs of running prose, or through the use of specialised sub-elements. There are also attributes available to indicate the gender and role of a person.

A detailed description for a person could for example look like this:

  <listPerson>
    <person xml:id="ThorJon" sex="1" role="owner">
      <persName xml:lang="is">
        <forename>Þórður</forename>
        <addName type="patronym">Jónsson</addName>
      </persName>
      <birth notBefore="1672-01-01" notAfter="1672-12-31">1672</birth>
      <death when="1720-08-21">21 August 1720</death>
      <residence>
        <placeName>
          <settlement type="farm">Staðastaður</settlement>
          <region type="parish">Staðarsveit</region>
          <region type="county">Snæfellsnessýsla</region>
          <region type="compass">Western</region>
          <country key="IS">Iceland</country>
        </placeName>
      </residence>
    </person>
    <!-- more <person> elements could follow -->
  </listPerson>

The value '1' of @sex stands for “male”, while '2' indicates “female”. Once such a detailed description of a name is available (either within the same XML document or in an external authority file to which the processor has access), naming elements anywhere in the document, i.e. both in the header and the text, can point to it:

<p> some text here...
    <name type="person" key="ThorJon">Þórður Jónsson</name>
    ... </p>

Treating names in this way means that each person is uniquely identified with an ID, to which all individual instances of that person’s name then refer, whatever form those instances take. This solves problems not only of variant spellings but also of cases where, for example, a medieval author is known by a Latin name and any number of vernacular forms, many or all of which may have claims to “authenticity”. In order to ensure uniformity, the method generally employed in the library world has been to accept the form found in some authority file, for example that of the American Library of Congress, as the “base” or “neutral” form. Feelings can run high on this matter, however, and people are frequently reluctant to accept as “neutral” an overtly “foreign” form of the name of some local saint or hero. Within the <person> element, any number of variant forms of a name can be given in multiple <persName> elements. Each of these can be assigned to a particular language using the @xml:lang attribute, but otherwise there is no prioritisation among them, and hence less likelihood of offence. The chief advantage of treating personal names in this way, however, is for searching, in particular once a large body of material has been put together. It is possible to search not only for persons with a particular name, but also for persons born in a particular place at a particular time. The <person> elements taken as a whole can also function as a reference tool, a veritable Who’s Who in medieval and early-modern Scandinavia. The possibilities as regards scribes are especially exciting, as it would be a relatively easy matter to add images to the <person> elements showing the hand or hands of each scribe, making it possible eventually to produce a register of all known scribes, searchable in terms of date, location etc.
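
The use of several parallel name forms can be sketched as follows. The identifier “OlafHar”, the selection of name forms and the spelling in the transcribed text are assumptions made for illustration only; the two-letter values of @xml:lang follow the pattern of the <person> example above:

```xml
<person xml:id="OlafHar" sex="1">
  <!-- variant forms of the same name, none of them given priority -->
  <persName xml:lang="la">Sanctus Olavus</persName>
  <persName xml:lang="nb">Olav den hellige</persName>
  <persName xml:lang="is">Ólafur helgi</persName>
</person>

<!-- elsewhere in the document: any spelling of the name can point to the same record -->
<name type="person" key="OlafHar">Olafr konungr</name>
```

Whatever form the name takes in the transcription, a search on the key “OlafHar” would retrieve every instance.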

It is possible to treat bibliographical references in a similar way. Since many of the same works are likely to be referred to again and again it would seem most sensible to provide full bibliographical information only once, in a separate bibliography, to which all bibliographical references in the individual records could then point.

The following is a simple bibliographical record as typically found either in the header or in a separate bibliographical authority file:

<biblStruct xml:id="StudIsl24">
  <monogr>
    <author><forename>Ólafur</forename>
      <addName type="patronym">Halldórsson</addName></author>
    <title level="m">Helgafellsbækur fornar</title>
    <title level="s">Studia Islandica</title>
    <imprint>
      <biblScope unit="vol">XXIV</biblScope>
      <date when="1966">1966</date>
    </imprint>
  </monogr>
</biblStruct>

Again, using the unique value of @xml:id, a <bibl> element anywhere in an XML file (that has access to the detailed record) can refer to that information using <ref> as follows:

<bibl><ref target="StudIsl24">Ólafur Halldórsson 1966</ref>, 
     pp. 18 and 22</bibl>

14.4 The encoding description

The <encodingDesc> should document the relationship between the electronic edition and the source it is based upon. It is an optional part of the header, but we recommend that it contains information on the standard of encoding and level of quality. It should have two sub-elements: a <projectDesc> and an <editorialDecl>.

The <projectDesc> can be used to specify in prose the standard of the encoding, e.g. “This text has been encoded according to the standard set out in The Menota handbook, version 3.0, at http://www.menota.org/guidelines”.

The <editorialDecl> uses the <correction> element with the @status attribute to specify the level of quality control. Attribute values (according to TEI) are “high”, “medium”, “low”, “unknown”. The TEI P5 Guidelines (ch. 2.3.3 “The Editorial Practices Declaration”) have these definitions for the possible values:

high: the text has been thoroughly checked and proofread
medium: the text has been checked at least once
low: the text has not been checked
unknown: the correction status of the text is unknown

Once the @status attribute is given a value, the <correction> element may be empty. However, if desired, further specification can be given in prose within a <p> element.
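
The two alternatives can be sketched as follows, here for a text that has been checked once. The wording of the prose note is invented for illustration:

```xml
<!-- status only, no further comment -->
<correction status="medium"/>

<!-- status with an optional description in prose -->
<correction status="medium">
  <p>The transcription has been checked once against the manuscript.</p>
</correction>
```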

Next within the <editorialDecl> element, a <normalization> element with a Menota-specific @me:level attribute is used to specify the level on which the text is encoded. The prototypical levels are “facs”, “dipl” and “norm”, but other levels can also be used in the transcription and should then be specified, e.g. a “pal” level. (See ch. 3.2 and ch. 3.4 for a discussion of these levels.) Here too, a description in prose may be added in a <p> element. Note that more than one level may be specified, simply by separating the values with whitespace:

  <normalization me:level="facs dipl norm">
    <p>This text has been encoded on three levels: facsimile, diplomatic
       and normalised.</p>
  </normalization>

Finally within the <editorialDecl> element, an <interpretation> element is used to specify the amount of lexical and grammatical information in the encoded text. We suggest two attributes, @me:lemmatized and @me:morphAnalyzed, both with the values “completely”, “partly” and “none”. A lemmatised text will have lemmata (i.e. dictionary entries) added in the @lemma attribute of the <w> element, while a morphologically analysed text will have grammatical forms specified in the @me:msa attribute of the same element. See ch. 4.3 for a general overview and ch. 9 for details on this lexical and morphological encoding. A description in prose may be added in a <p> element.

A complete <encodingDesc> may look like this:

  <encodingDesc>
    <projectDesc>
      <p>This text has been encoded according to the standard set out in
         <title>The Menota handbook</title>, version 3.0,
         at http://www.menota.org/guidelines.</p>
    </projectDesc>
    <editorialDecl>
      <correction status="high">
        <p>This text was proofread by Magnus Rindal and colleagues
           before the publication of the printed version in 1981. It is
           unlikely that it contains any significant number of errors.
           However, it cannot be ruled out that the subsequent conversion
           of the file may have introduced some systemic errors.</p>
      </correction>
      <normalization me:level="dipl">
        <p>This text has been encoded on a diplomatic level, according
           to the editorial practice by Norsk Historisk
        </p>
      </normalization>
      <interpretation me:lemmatized="completely" me:morphAnalyzed="completely">
        <p>The complete text has been lemmatised and morphologically
           analysed according to the rules specified in ch. 9 of the
           Menota Handbook, v. 3.0.</p>
      </interpretation>
    </editorialDecl>
  </encodingDesc>

14.5 The profile description

The <profileDesc> is an optional part of the header. We recommend that the language(s) used in the source be listed here within the element <langUsage>. <langUsage> contains one or more <language> elements, each with an @ident attribute. The value of @ident should be a three-letter code, where possible based on the international standard ISO 639-2.

ISO 639-2 contains a list of three-letter abbreviations of language names. In addition to codes for the modern languages, such as “dan” (Danish), “ice” or “isl” (Icelandic), “nor” (Norwegian) and “swe” (Swedish), it lists languages like Latin (“lat”) and Ancient Greek (“grc”). For Medieval Nordic, there is only one abbreviation: “non” for Old Norse, i.e. Old Icelandic and/or Old Norwegian. Since Old Norse is a problematic term and the abbreviation “non” is idiosyncratic, we recommend the values “oda” (Old Danish), “oic” (Old Icelandic), “onw” (Old Norwegian) and “osw” (Old Swedish). In cases of uncertainty, a hyphen may be used, e.g. “oic-onw” for a manuscript which is either Old Icelandic or Old Norwegian (but most probably Old Icelandic), “onw-oic” for the other way round, etc. Please note that this usage does not conform to the ISO standard.

Additionally, the <profileDesc> may be used to identify and give more detailed information on, for instance, persons or scribal hands that are referred to elsewhere in the document. In ch. 14.3.6 we have already encountered an example of how this could be useful for referring to a person, using <listPerson> and <person>. Similarly, different hands in the source can be given IDs, so that they may be referred to elsewhere in the document, most notably in the transcription to mark a change of hands.

A <profileDesc> may look like this:

  <profileDesc>
    <langUsage>
      <language ident="oic">Old Icelandic</language>
      <language ident="onw">Old Norwegian</language>
      <language ident="osw">Old Swedish</language>
      <language ident="oda">Old Danish</language>
      <language ident="oic-onw">Old Icelandic with Old Norwegian</language>
      <language ident="onw-oic">Old Norwegian with Old Icelandic</language>
      <language ident="lat">Latin</language>
      <language ident="grc">Ancient Greek</language>
    </langUsage>
    <handNotes>
      <handNote xml:id="h1"/>
      <handNote xml:id="h2"/>
    </handNotes>
  </profileDesc>
14.6 The revision description

Although this is an optional part of the header, it is essential that all changes to the file be recorded. Each change is described within a separate <change> element. Within it, the <date> is given first, then the <name> of the reviser (preferably with affiliation), and, finally, a description in prose of the actual change.

A short series of <change> elements may look like this:

  <revisionDesc>
    <change><date>…</date>
      <name>
        <persName>Odd Einar Haugen</persName>
        <orgName type="affiliation">University of Bergen</orgName>
      </name>: Revised the encoding in accordance with
             v. 3.0 of the Menota handbook.
    </change>
    <change><date>…</date>
      <name>
        <persName>Tone Merete Bruvik</persName>
        <orgName type="affiliation">Aksis</orgName>
      </name>: Revised the transcription in accordance with
             v. 2.0 of the Menota handbook.
    </change>
  </revisionDesc>

14.7 Example headers

Examples of Menota headers can be accessed in Appendix E. There are examples of both minimal and longer, more detailed headers.

The first two examples are minimal headers: one is for a single-text source, here Holm perg 6 fol (Barlaams ok Josaphats saga), while the other is for a multi-text source, here AM 242 fol (Codex Wormianus). Please note that these two examples show rather minimalistic headers with little information on, for instance, the manuscript source. The third example is more detailed and includes information on material features of the manuscript and its provenance.

First published 28 August 2016. Last updated 10 September 2017.