Ch. 14. The header
Version 3.0 (final publication expected in November 2019)
by Matthew Driscoll, Beeke Stegmann and Odd Einar Haugen
This chapter deals with the first major part of any Menota XML file, the header. The header should describe the file so that meta-level information about the text itself, its source, its encoding and its revisions is sufficiently documented. The header has four major parts:
|<fileDesc>||A file description|
|<encodingDesc>||An encoding description|
|<profileDesc>||A text profile|
|<revisionDesc>||A revision history|
This chapter will discuss the recommended (minimal) amount of information for each of the four parts. Due to practical considerations, the <fileDesc> is treated in two separate subsections, one dealing with the meta-level information on the file history, the other concerning the source from which the XML-file is created, i.e. in our case usually the manuscript.
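The overall shape of the header can be sketched as follows. This is only a skeleton: the enclosing <teiHeader> element is taken from the TEI P5 Guidelines rather than from the text above, and the comments stand in for content that a real file must supply.

```xml
<teiHeader>
  <fileDesc>
    <!-- title, edition, extent and publication statements (ch. 14.2) -->
    <!-- source description, i.e. the manuscript (ch. 14.3) -->
  </fileDesc>
  <encodingDesc>
    <!-- encoding description -->
  </encodingDesc>
  <profileDesc>
    <!-- text profile -->
  </profileDesc>
  <revisionDesc>
    <!-- revision history -->
  </revisionDesc>
</teiHeader>
```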
14.2 The file description: Title, edition, extent and publication
The file description is a mandatory part of the header and must include information on the title, on the publication and on the source, cf. ch. 2.2 “The File Description” of the TEI P5 Guidelines. It contains a number of elements, several of which were discussed in ch. 10 above (<name>, <persName>, <forename>, <surname> and <addName>).
The first part of the <fileDesc> supplies the necessary bibliographical information about the digital text in the respective title, edition, extent and publication statements. The main editor(s) should always be identified, but since many digital editions are the result of teamwork (or of cumulative work), all major contributors should be listed. Please note: The description of the source for transcription (done in the element <sourceDesc>) is also part of the <fileDesc>. However, in order to improve legibility, that part is treated in a separate subchapter (see ch. 14.3 below).
14.2.1 Title statement
|Elements & attributes||Obl/Fac||Explanation|
|<titleStmt>||Information on the title, editor and other people who have been responsible for the edition|
|<title>||The title of the work|
|<editor>||The name of the (main) editor of the encoded work|
|@role||Fac||The role of an editor, in particular in case the editor is an institution or project|
|<orgName>||The name of an organisation|
|@type||Obl||Specifies that the organisation is the institution with which the editor is affiliated, using the value ‘affiliation’|
|<respStmt>||A statement of responsibility|
|<resp>||Type of responsibility, e.g. transcription, conversion, proof-reading|
In the <titleStmt>, the <title> element gives the title of the document. In the case of a digital transcription, it should specify the primary source (manuscript) on which the transcription is based, and, where applicable, the title of the work transcribed. We recommend that the title also states that the present text is an electronic edition. In single-text manuscripts, the title may look like this:
<title>Barlaams ok Josaphats saga : Holm perg 6 fol : a digital edition</title>
In multi-text manuscripts, the title will be somewhat longer, since each text should be listed (unless there are too many of them):
<title>Snorra-Edda, the four grammatical treatises, Rígsþula, Maríukvæði, and ókennd heiti : AM 242 fol (Codex Wormianus) : a digital edition</title>
If the text is a fragment of a manuscript, this should be stated in the title:
<title>A fragment of Konungs skuggsjá : NKS 235 g 4to : a digital edition</title>
The full list of titles for the works treated in the document will be given in the <sourceDesc> (see ch. 14.3 below), meaning that it is not necessary to include all details in the document title. In general, however, manuscripts are referred to based on the principles described in ch. 15.4.4.
In addition to the title, the <titleStmt> must also list the editor(s) and other contributors to the edition. We recommend that one or more people (or institutions) are identified as the main editor(s) of the text in the <editor> element. In this case, there is a single editor:
<editor> <name> <persName> <forename>Magnus</forename> <surname>Rindal</surname> </persName> <orgName type="affiliation">University of Bergen</orgName> </name> </editor>
Note that while the more detailed encoding of names using <persName> and its specialised subelements as described in ch. 10 may be employed in the header, it need not be used in full. It is acceptable to simply use the <name> element in the header:
<editor> <name type="person">Magnus Rindal</name> <orgName type="affiliation">University of Bergen</orgName> </editor>
See also ch. 14.3.6 below.
Multiple main editors should be listed either in alphabetical order, if equally responsible, or in order of importance concerning the editing work.
Institutions as well as individuals may be given as editors. If an institution is regarded as the editor, it should be specified by the attribute @role:
<editor role="institution"> <orgName>Språksamlingane</orgName> </editor>
The institution “Språksamlingane” is now located at the University of Bergen, but was until 2016 part of the University of Oslo under the name of Gammelnorsk Ordboksverk. To clarify the location of the institution it can be useful to add this information in an additional <orgName> element:
<editor role="institution"> <orgName>Språksamlingane</orgName> <orgName> <name type="city">Bergen</name> <name type="role">University</name> </orgName> </editor>
For an individual editor, the @role with the value ‘person’ is optional, but may be inserted for clarity. If an <editor> element does not have any @role attribute, it is assumed to describe a person.
If people other than the main editor(s) contributed to the work, this is specified in one or more responsibility statements (<respStmt>). This also applies to cases where an institution is regarded as the editor. In order to clarify the division of work and responsibility, the work by the main editor(s) should also be specified in one or more responsibility statements.
<editor> <name type="person">Nina Stensaker</name> <orgName type="affiliation">University of Bergen</orgName> </editor> <respStmt> <resp>Transcription</resp> <name type="person">Nina Stensaker</name> <orgName type="affiliation">University of Bergen</orgName> </respStmt> <respStmt> <resp>Conversion of transcription to XML</resp> <name type="person">Robert K. Paulsen</name> <orgName type="affiliation">University of Bergen</orgName> </respStmt> <respStmt> <resp>Project overview</resp> <name type="person">Odd Einar Haugen</name> <orgName type="affiliation">University of Bergen</orgName> </respStmt>
The TEI P5 Guidelines also recommend that the element <author> is included in the <titleStmt> (ch. 2.2.1 “The Title Statement”). Since almost all Medieval Nordic texts are anonymous we believe this element is not required.
14.2.2 Edition statement
|Elements & attributes||Obl/Fac||Explanation|
|<editionStmt>||A statement of the edition|
|<edition>||A description of the edition (i.e. version), typically by means of a number|
|@n||Obl||The number of the edition|
The <editionStmt> should be used to specify whether the present text is a new or a revised edition of the electronic text as described in the title statement above. Here, “edition” is to be understood as “version”. The version number should be given in the @n attribute with the usual number system, i.e. 1.0, 1.0.1, 1.1, 1.2, etc., while the date of the version should be given in a <date> element inside <edition>, using the format year-month-day in its @when attribute, e.g. ‘2004-02-01’.
A complete edition statement may be as simple as this:
<editionStmt> <edition n="1.0">First draft, <date when="2014-02-01"> 1 February 2014</date>.</edition> </editionStmt>
14.2.3 Extent
|Elements & attributes||Obl/Fac||Explanation|
|<extent>||The size of the file, preferably specified in words|
|@n||Obl||The number of words (or any other measure)|
The <extent> element specifies the size of the file. The exact number of words should be given in the @n attribute as well as in plain text within the element, e.g.:
<extent n="76411">76411 words</extent>
14.2.4 Publication statement
|Elements & attributes||Obl/Fac||Explanation|
|<publicationStmt>||A statement of the publication|
|<distributor>||A reference to the distributor, e.g. Medieval Nordic Text Archive|
|<idno>||A reference (identification number), e.g. ‘Ms. 1’|
|@type||Fac||The type of reference, e.g. ‘Menota’|
|<date>||The date for the publication of the edition|
|@when||Obl||The date in the year-month-day format, e.g. 2017-03-08|
|<availability>||A description of the conditions for the distribution and use of the text|
|@status||Obl||The type of availability, typically with the values ‘free’, ‘restricted’ or ‘unknown’.|
The <distributor> element specifies the body (publisher, archive) which has made the text available, e.g. the Medieval Nordic Text Archive (Menota).
The <idno> is a unique identification of the text. For texts in the Menota archive the value of the @type attribute will be ‘Menota’, and the contents of the element will be an acquisition number, beginning with Ms. 1. Note that this information will be supplied by Menota, if the text is being deposited in this archive.
The <availability> element specifies the accessibility of the text. We recommend adding a @status attribute with one of the three values “free”, “restricted”, “unknown” (cf. ch. 2.2.4 “Publication, Distribution, etc.” of the TEI P5 Guidelines).
Further specifications can be added in a <p> element. Almost all texts in the Menota archive are now available under an open CC licence, and this should be stated in a <licence> element with a link to the Creative Commons website. Details on the transferral of this licence should be added in a <p> element. A complete publication statement may thus look like this:
<publicationStmt> <distributor>Medieval Nordic Text Archive</distributor> <idno type="Menota">Ms. 2</idno> <date when="2006-12-19">19 December 2006</date> <availability status="free"> <licence target="http://creativecommons.org/licenses/by-sa/4.0/">CC-BY-SA 4.0</licence> <p>Licence accepted by Karl G. Johansson in a mail to Odd Einar Haugen 2 November 2015.</p> </availability> </publicationStmt>
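To show how the four statements fit together, the following sketch places them in the order required by the TEI content model for <fileDesc>, followed by the source description. Everything in square brackets is a placeholder rather than real data, so the fragment will not validate until the placeholders (including those in attribute values) have been replaced.

```xml
<fileDesc>
  <titleStmt>
    <title>[title of the work : shelfmark : a digital edition]</title>
    <editor>
      <name type="person">[name of the main editor]</name>
      <orgName type="affiliation">[institution]</orgName>
    </editor>
    <respStmt>
      <resp>Transcription</resp>
      <name type="person">[name of the transcriber]</name>
    </respStmt>
  </titleStmt>
  <editionStmt>
    <edition n="1.0">[description of the version, including a date element]</edition>
  </editionStmt>
  <extent n="[number of words]">[number of words] words</extent>
  <publicationStmt>
    <distributor>Medieval Nordic Text Archive</distributor>
    <idno type="Menota">[acquisition number supplied by Menota]</idno>
    <date when="[yyyy-mm-dd]">[date of publication]</date>
    <availability status="free">
      <p>[conditions of use]</p>
    </availability>
  </publicationStmt>
  <sourceDesc>
    <!-- manuscript description, see ch. 14.3 -->
  </sourceDesc>
</fileDesc>
```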
14.3 Source description: Manuscript description
The <sourceDesc> is a mandatory part of the header and describes the source material (cf. ch. 2.2.7 “The Source Description” of the TEI P5 Guidelines). It is a child of <fileDesc>, and in the case of a digital edition, the source is the manuscript carrying the transcribed text. Therefore, the source is usually described using the element <msDesc> (manuscript description), which is placed within <sourceDesc>. With TEI P5, this part of the header includes specific elements for manuscript description, based chiefly on the work of the EU-funded MASTER project (1999-2001) and the TEI Medieval Manuscripts Description Work Group (1998-2000). For detailed information on the manuscript description module, see ch. 10 “Manuscript Description” of the TEI P5 Guidelines.
The <msDesc> element is the framing element into which the manuscript description is put. The description need not consist of more than the basic information necessary to identify the source, i.e. its location, both geographical and institutional, and its shelfmark or other identifying number or name (e.g. Oslo, Universitetsbibliotek, UB 1042 8vo). However, it is also possible to provide a detailed description of the source, analogous to what one would find in a thorough catalogue record or in the introduction to a scholarly edition. (Note that while the <msDesc> element will normally appear within <sourceDesc> in the document header, it can also appear anywhere within the body of a TEI conformant document, in the same way as the bibliographic elements <bibl>, <biblStruct> and <biblFull>.)
Within <msDesc> the following six elements are available, of which only the first is required:
|Elements & attributes||Explanation|
|<msIdentifier>||Groups information that uniquely identifies the manuscript, i.e. its location, holding institution and shelfmark.|
|<msContents>||Contains an itemised list of the intellectual content of the manuscript or manuscript part, either as a series of paragraphs or as a series of structured manuscript items, possibly including transcriptions of rubrics, incipits, explicits etc., as well as primary bibliographic references.|
|<physDesc>||Groups information concerning all physical aspects of the manuscript or manuscript part, its material, size, format, script, decoration, binding, marginalia etc.|
|<history>||Provides information on the history of the manuscript or manuscript part, its origin, provenance and acquisition by its holding institution.|
|<additional>||Groups other information about the manuscript, in particular, administrative information relating to its availability, custodial history, surrogates etc.|
|<msPart>||Contains in essence a nested <msDesc>, in cases of composite manuscripts now regarded as constituting a single unit but made up of two or more parts which were originally physically distinct; since the contents, physical description and history of the individual parts will normally be quite different, a <msPart> element can contain all the elements listed here, including additional <msPart> elements.|
Within each of these elements a number of sub-elements are available; <msContents>, for example, will normally consist of one or more <msItem> elements, each in turn containing specific elements such as <title> or <locus> (see ch. 14.3.2 below). Technically, the contents need not be this structured, since with all the elements listed above, apart from <msIdentifier>, there is also the option of using ordinary prose, marked up with the <p> element. However, doing so would greatly limit the possibilities both for processing and for searching the data. On the other hand, it could be preferable when dealing with pre-existing descriptions (so-called “legacy data”), the exact form of which one may wish, or be required, to maintain. If at all possible, we recommend using the available specific elements to structure and mark up the manuscript description.
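The difference between the two approaches can be illustrated with a small sketch. Both fragments record the same fact (the title and folio range are borrowed from the example in ch. 14.3.2), but only the second makes the location and the title individually addressable for searching and processing. The wording of the prose paragraph is merely an illustration.

```xml
<!-- Alternative 1: unstructured prose, e.g. for legacy data -->
<msContents>
  <p>Ff. 1r-9v contain fragments of Knýtlinga saga.</p>
</msContents>

<!-- Alternative 2: structured description (recommended) -->
<msContents>
  <msItem>
    <locus from="1r" to="9v">1r-9v</locus>
    <title>Knýtlinga saga</title>
  </msItem>
</msContents>
```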
14.3.1 Manuscript identifier
The only mandatory element within <msDesc> is <msIdentifier>. For <msIdentifier>, a number of sub-elements are available, among others <country>, <region>, <settlement> (the TEI term for what most people would call city), <institution>, <repository>, <collection> and <idno> (an identifying number, here used for the shelfmark of a manuscript). Although not required, it is strongly recommended that at least the elements <settlement>, <repository> and <idno> are included, since they provide what is, by common consent, the minimum amount of information necessary to identify a manuscript. In many cases, no other elements are needed, as common sense will suffice to distinguish, say, Paris, France from Paris, Texas, as the location of the Bibliothèque Nationale. For search purposes, however, and in order to be precise, it is possible to link to external references (see ch. 15).
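As a minimal sketch, the identification cited in ch. 14.3 (Oslo, Universitetsbibliotek, UB 1042 8vo) could be encoded with just the three recommended elements. The surrounding <sourceDesc> and <msDesc> are included to show where the identifier sits.

```xml
<sourceDesc>
  <msDesc>
    <msIdentifier>
      <settlement>Oslo</settlement>
      <repository>Universitetsbibliotek</repository>
      <idno>UB 1042 8vo</idno>
    </msIdentifier>
  </msDesc>
</sourceDesc>
```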
There are two further sub-elements of <msIdentifier> available: <altIdentifier>, which contains an alternative, structured identifier of a manuscript, such as a catalogue number or former shelfmark, and <msName>, which contains any form of unstructured alternative name used for a manuscript, such as a nickname. The manuscript Uppsala, Universitetsbiblioteket, DG 1, for example, is far better known under the name Codex Argenteus or the Silver Bible. There are many examples of such nicknames among Nordic manuscripts, for instance in the Arnamagnæan Collection, as Árni Magnússon frequently gave his manuscripts names based on the places where they had been made or where he got them from or the people whom he knew to have possessed them. Occasionally a manuscript can have several such names, or rather several forms of the same name, typically in different languages, in which case multiple <msName> elements are used. The different forms can be distinguished from each other by means of the @xml:lang attribute, which is available on all TEI elements.
A <msIdentifier> for a manuscript in the Arnamagnæan Collection may look like this:
<msIdentifier> <country key="DK">Danmark</country> <settlement>København</settlement> <repository>Den Arnamagnæanske Samling</repository> <idno>AM 45 fol</idno> <altIdentifier type="KKKat"> <idno>59</idno> </altIdentifier> <msName type="nickname" xml:lang="la">Codex Frisianus</msName> <msName type="nickname" xml:lang="is">Fríssbók</msName> </msIdentifier>
In addition to the commonly used shelfmark, this example also has an <altIdentifier>, giving the running number from Kålund’s printed catalogue. That this reference is to Kålund’s catalogue can be seen from the @type attribute with the abbreviation “KKKat” as its value. The <msName> elements additionally give two forms of the manuscript’s nickname, one in Latin, one in Icelandic.
14.3.2 Intellectual content
A detailed description of a manuscript’s intellectual contents is put in the <msContents> element, which is the next major sub-element of <msDesc>. <msContents> consists of one or more <msItem> elements, which can be prefaced, if desired, by a <summary> element.
An <msItem> element typically contains at least the elements <locus> and <title> to specify the location in the manuscript and the title of the text in question. More detail can be provided by means of further elements, such as <author>, <incipit>, <explicit>, <rubric>, <finalRubric>, <colophon>, <textLang> and <note>. <msItem> elements are further allowed to “nest”, by which is meant that an <msItem> can contain other <msItem> elements. This is useful where separate items (or subitems) in a manuscript are grouped under a single title or rubric, for example in collections of prayers.
A @defective attribute, with possible values of ‘true’, ‘false’, ‘unknown’ or ‘inapplicable’, is available on <msItem>, providing the means of distinguishing between texts which are fragmentary and those which are not. The attribute is also available on the specialised elements <incipit> and <explicit>. When dealing with collections of fragments, each fragment may be given as a separate <msItem> and the first and last words of each transcribed as defective incipits and explicits, as in the following example showing a manuscript with four text fragments of a single work:
<msContents> <msItem n="1" defective="true"> <locus from="1r" to="9v">1r-9v</locus> <title>Knýtlinga saga</title> <msItem n="1.1"> <locus from="1r:1" to="2v:30">1r:1-2v:30</locus> <incipit defective="true">dan<ex>n</ex>a a engl<ex>an</ex>di</incipit> <explicit defective="true">en meðan har<ex>aldr</ex> hein hafði k<ex>onung</ex>r v<ex>er</ex>it yf<ex>ir</ex> danmork</explicit> </msItem> ... <!-- msItems 1.2 to 1.4 --> ... </msItem> </msContents>
The standard TEI element <bibl> (and the grouping parent element <listBibl>) is also available within <msItem>. This should be used to provide bibliographical information on the <msItem> level, i.e. concerning editions of the item in question. Bibliographical information pertaining to the manuscript as a whole should be placed in the <additional> element, described below (see ch. 14.3.5).
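A sketch of item-level bibliography might look as follows. The item is borrowed from the example above, and the bracketed text is a placeholder for an actual reference to an edition of this item.

```xml
<msItem n="1" defective="true">
  <locus from="1r" to="9v">1r-9v</locus>
  <title>Knýtlinga saga</title>
  <listBibl>
    <bibl>[reference to an edition of this item]</bibl>
  </listBibl>
</msItem>
```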
14.3.3 Codicological features
The next major element in <msDesc> is <physDesc>, i.e. a physical description. The first element within <physDesc> is <objectDesc>, which relates specifically to the text-bearing object and contains two further sub-elements, <supportDesc> and <layoutDesc>.
<supportDesc> can contain various elements relating to the physical object, or carrier, on which the text is inscribed, such as <support>, describing whether the text is written on parchment, paper etc. and a description thereof; <extent>, the number and size of leaves; <foliation>, how and, if known, when and by whom the manuscript was paginated/foliated; <collation>, a description of the quire structure, any missing leaves and so on; and <condition>, for a description of the current physical state of the manuscript. <layoutDesc> contains one or more <layout> elements, detailing the way(s) in which the text is organised on the page, specifying the number of columns, the dimensions of the written area, the number of lines per page/column etc. The example below shows a simple <objectDesc>. Please note that this <objectDesc> only makes use of selected elements.
<objectDesc form="fragment"> <supportDesc> <support> <p><material>Parchment</material>.</p> </support> <extent>2 leaves; 220 mm (height) by 150 mm (width).</extent> <foliation> <p>The manuscript is foliated: 1-2.</p> </foliation> <collation>One bifolium.</collation> </supportDesc> <layoutDesc> <layout columns="1" writtenLines="23 26"> <p>The text is written in one column of 23-26 lines each; there are approximately 10-12 words per line; the written area measures 205 mm (height) by 130 mm (width).</p> </layout> </layoutDesc> </objectDesc>
The second group of elements within a structured physical description concerns aspects of the writing, illumination or other notation (notably, music) found in a manuscript, including additions made in later hands – the text, as opposed to the carrier. Possible elements are: <handDesc>, usually containing one or more <handNote> elements, <decoDesc>, containing one or more <decoNote> elements, as well as <musicNotation> and <additions>, both containing one or more paragraph elements (<p>).
The <handDesc> element is intended for a description of the scribal hand or hands of the manuscript. This may simply be encoded as one or more <p> elements, but more commonly, the various paragraphs are structured as a series of <handNote> elements, each containing a prose description of one of the hands. The following is an example of a brief <handDesc> element for a manuscript written by a single scribe:
<handDesc hands="1"> <p>Written in <term type="script">Gothic hybrid</term>. The scribe is unknown, but the same hand is found in sections of AM 23 4to and GKS 25 fol.</p> </handDesc>
Please note: The use of the element <term> with its attribute @type as in the example is optional. Such encoding allows for more precise searching than would be possible with free text, but is obviously dependent on there being a commonly agreed taxonomy.
The next example (below) is taken from a manuscript written by six different scribes, where a <handNote> element is used to describe each individual hand. The @script attribute on <handNote> is employed to indicate the type of script. Note also the @scope attribute, with the possible values of ‘major’, ‘minor’ and ‘sole’. Finally, the <locus> element, which we have already encountered in <msItem>, is used to indicate specifically which parts of a manuscript are written in a given hand.
<handDesc hands="6"> <handNote script="Hybrida" scope="major"> <p>The main hand (Hand 1) wrote <locus>ff. 1r-9r and 16r-118v</locus> in a practised Gothic hybrid.</p> </handNote> .... <!-- 5 more handNote elements follow here --> </handDesc>
As each <handNote> element may contain one or more paragraphs, there is no limit to the amount of information which may be given on any single hand. Thus, a detailed analysis of palaeographical and orthographical features (“/a/ is of the two-storey kind” etc.) is perfectly possible within this overall structure if desired.
There is a corresponding element, <decoDesc>, for the description of illumination and other decorative features in the manuscript. <decoDesc>, like <handDesc>, may simply contain one or more paragraphs, but ideally, it consists of a sequence of topically organised sub-elements, called <decoNote>, each describing either a decorative component of a manuscript (e.g. a single illuminated initial) or a homogeneous class of such components (e.g. sentence initials in general). The example below shows the usage of multiple <decoNote> elements in <decoDesc>. For detailed step-by-step guidelines on the description of illuminations using <decoNote> see ch. 7.2.
<decoDesc> <decoNote> <p>F. <locus from="1v" to="1v">1v</locus>: full-page illumination, Holy King <name type="person">Óláfr</name>. Caption: <q>Olafur · Haraldz son · Noreks kongur</q>. Colours: yellow, dark green, light red (vermillion), outline: dark brown.</p> </decoNote> <decoNote> <p>F. <locus from="3r" to="3r">3r:24-36</locus>: major pen flourished initial L. Colours: blue, light red.</p> </decoNote> <decoNote> <p>F. <locus from="7r" to="7r">7r</locus>: bas-de-page, two dogs running, a lamb and a bird. Colours: black, dark red, white(?).</p> </decoNote> ... <decoNote> <p>Ff. <locus from="1r" to="27v">1r-27v</locus>: sentence initials were added later, potentially by <name type="person">Jón Erlendsson</name>.</p> </decoNote> ... </decoDesc>
If a manuscript contains musical notation, the element <musicNotation> may be used to describe it. The form, and possibly location, of such musical notation is specified using one or more paragraphs:
<musicNotation> <p>F. <locus>34v</locus>: Square notation on 4-line red staves.</p> </musicNotation>
Finally, the <additions> element is used to list or describe any marginalia or other additions to the manuscript which may be considered of interest or importance. Such additions can additionally be referred to and discussed elsewhere, for example as part of the <history> element in cases where the marginalia provide evidence of ownership (see ch. 14.3.4). In those cases we recommend describing (and potentially transcribing) the marginalia in detail in the <additions> element, while merely referring to them in other contexts.
<additions> <p>The manuscript contains the following marginalia: <list> <item>F. <locus>4v</locus>, left margin: <q xml:lang="is">hialmadr <ex>ok</ex> <lb/>brynjadr</q>, in a fifteenth-century hand, imitating an addition made to the text by the scribe at this point.</item> <item>F. <locus>5r</locus>, lower margin: <q xml:lang="is"> þ<ex>e</ex>tta þiki m<ex>er</ex> v<ex>er</ex>a gott blek en<ex>n</ex>da kan<ex>n</ex> ek ecki betr sia</q>; fifteenth-century hand, probably the same as that on the previous page.</item> <item>F. <locus>9v</locus>, bottom margin: <q xml:lang="is">þessa bok uilda eg <sic>gæt</sic> lært med<lb/>an Gud gefe myer Gott ad <lb/>læra</q>; seventeenth-century hand.</item> </list> </p> </additions>
It should be noted that the various mechanisms for the transcription of primary sources described elsewhere in this handbook – expansion of abbreviations and so on – may be employed here as well.
The third group of elements in <physDesc> pertains to things such as the binding and other material that might be attached to the manuscript or stored with it. These elements include <bindingDesc>, containing a description of the state of the present as well as potential former bindings of a manuscript (given either as a series of paragraphs or as one or more distinct <binding> elements); <sealDesc>, which supplies information about the seal(s) attached to a document or charter (again either as paragraphs summarising the overall nature of the seals, or as one or more <seal> elements); and <accMat>, for describing and/or transcribing any material not originally part of the manuscript but bound with it or otherwise accompanying it. For example, the small paper slips on which Árni Magnússon frequently noted details about a manuscript or its provenance, which are often kept with the manuscript in question, can be described here using one or more paragraphs.
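The third group might be sketched like this, with the elements in the order in which the TEI content model expects them at the end of <physDesc>. All bracketed text is a placeholder, and <sealDesc> would in practice only be present for documents or charters bearing seals.

```xml
<physDesc>
  <!-- objectDesc, handDesc, decoDesc, additions etc. would precede -->
  <bindingDesc>
    <binding>
      <p>[description of the present binding]</p>
    </binding>
  </bindingDesc>
  <sealDesc>
    <seal>
      <p>[description of a seal attached to the document]</p>
    </seal>
  </sealDesc>
  <accMat>
    <p>[description and/or transcription of a slip kept with the manuscript]</p>
  </accMat>
</physDesc>
```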
14.3.4 The history of the manuscript
The <history> element contains information on the history of the manuscript. Available within it are three sub-elements: <origin>, for information on when, where and potentially for whom the manuscript was written; <provenance>, in which any evidence of ownership and use is provided; and <acquisition>, which describes when and how the manuscript was acquired by its current owner or holding institution. Each of these elements may contain one or more paragraphs which may contain more specialised elements. Alternatively, as with the other major elements in a manuscript description, the <history> element may itself consist of one or more paragraphs in which the history of the manuscript is summarised.
Brief information about the origin of a manuscript is often available in catalogues. In the recommended XML-structure it can be encoded like this:
<history> <origin> <p>Written in <origPlace>Vadstena, Sweden</origPlace> in <origDate notBefore="1500" notAfter="1550">the first half of the 16th century</origDate>. </p> </origin> </history>
The usage of elements such as <origPlace> and <origDate> with various suitable attributes is not required but highly recommended as it facilitates searches.
Information regarding the provenance and acquisition of a manuscript is frequently more difficult to obtain, if at all available. In manuscripts from the Arnamagnæan Collection, Árni Magnússon quite frequently left note slips containing such details. While the text of his notes would usually be transcribed in the element <accMat> described above, the contents can be discussed and interpreted in the elements <provenance> and/or <acquisition>. The same applies to other additions and marginalia (noted and potentially transcribed in <additions>) that may contain hints at the provenance or general history of a manuscript.
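A <history> element using all three sub-elements could be sketched as follows. The <origin> repeats the example above, while the bracketed text in <provenance> and <acquisition> is a placeholder for the discussion and interpretation just described.

```xml
<history>
  <origin>
    <p>Written in <origPlace>Vadstena, Sweden</origPlace> in
      <origDate notBefore="1500" notAfter="1550">the first half of the 16th century</origDate>.</p>
  </origin>
  <provenance>
    <p>[discussion of evidence of ownership and use, e.g. marginalia recorded in additions]</p>
  </provenance>
  <acquisition>
    <p>[when and how the manuscript came to its present holding institution]</p>
  </acquisition>
</history>
```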
14.3.5 Other information
The final large grouping element in a manuscript description is the already mentioned element <additional>. The first subsection of this element is called <adminInfo>, which, as its name suggests, contains information pertaining to the curation and management of the manuscript. Such information would not normally form part of the introduction to a scholarly edition, but may be included in the XML document header. Sub-elements available here include <custodialHist>, in which information can be given on such matters as conservation, loans and exhibitions and so on, either as a series of paragraphs or one or more dated <custEvent> elements, and the standard TEI element <availability>, for information on the availability of the manuscript, for example any restrictions on its use or access etc.
Also available within <additional> is a <surrogates> element for information on photographic reproductions. Here it is possible to provide information on, and links to, any digital reproductions of the manuscript that may be available.
Finally, the element <listBibl> is available within <additional> for bibliographical information regarding the manuscript as a whole. Inside <listBibl>, one or more <bibl> elements are used, one for each bibliographic reference. References to editions or other bibliographical information relevant to individual text-items, on the other hand, should rather be given under the appropriate <msItem> (see ch. 14.3.2 above).
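Taken together, the parts of <additional> might be sketched as below. The bracketed text is placeholder content, and the type and date of the <custEvent> are hypothetical values chosen only to show the attributes. Note that <availability> precedes <custodialHist> within <adminInfo> in the TEI content model.

```xml
<additional>
  <adminInfo>
    <availability status="restricted">
      <p>[conditions of access to the manuscript]</p>
    </availability>
    <custodialHist>
      <!-- hypothetical event type and date, for illustration only -->
      <custEvent type="conservation" when="2000-01-01">[description of the conservation work]</custEvent>
    </custodialHist>
  </adminInfo>
  <surrogates>
    <p>[description of, and link to, a digital reproduction]</p>
  </surrogates>
  <listBibl>
    <bibl>[reference concerning the manuscript as a whole]</bibl>
  </listBibl>
</additional>
```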
14.3.6 Mark-up for names and bibliographical references using authority records
Most of the elements that have been mentioned so far have the character of predefined boxes into which information of a certain type can be fitted. But it will be noted in the examples cited that there are other kinds of elements which can appear anywhere within the document, so-called “phrase-level elements”, of which there is a large number available in the TEI standard. These are primarily used in order to facilitate certain types of processing and/or for search purposes.
Wherever names occur (both in the header and the text), they can be tagged using phrase-level elements such as the <name> element, usually with a @type attribute to indicate whether they are the names of persons, places or organisations (such as religious orders). (For other phrase-level elements designed for names see ch. 10.) If desired, more information about persons or places may also be provided by means of linking to an authoritative, more detailed description using the attribute @key. This requires, however, that the key in question has been defined elsewhere, for instance in a <listPerson> element within the header’s <profileDesc> (see ch. 14.5 below). The <listPerson> element can contain one or more <person> elements, each of which is identified by a unique value of the @xml:id attribute. The same value is then given in the @key attribute of any <name> element that points to the longer description. The contents of the individual <person> elements in <listPerson> provide information on birth, death, residence, occupation and so on, either as one or more paragraphs of running prose, or through the use of specialised sub-elements. There are also attributes available to indicate the gender and role of a person.
A detailed description for a person could for example look like this:
<listPerson>
  <person xml:id="ThoJon001" sex="1" role="owner">
    <persName xml:lang="is">
      <forename>Þórður</forename>
      <addName type="patronym">Jónsson</addName>
    </persName>
    <birth notBefore="1672-01-01" notAfter="1672-12-31">1672</birth>
    <death when="1720-08-21">21 August 1720</death>
    <residence>
      <placeName>
        <settlement type="farm">Staðastaður</settlement>
        <region type="parish">Staðarsveit</region>
        <region type="county">Snæfellsnessýsla</region>
        <region type="compass">Western</region>
        <country key="IS">Iceland</country>
      </placeName>
    </residence>
    <occupation>Clergyman</occupation>
  </person>
  ...
  <!-- more <person> elements could follow -->
  ...
</listPerson>
Please note that in the Old Norse scholarly world the value of the attribute @sex has traditionally been encoded according to the ISO/IEC 5218 standard “Representation of Human Sexes” (http://standards.iso.org/ittf/PubliclyAvailableStandards/c036266_ISO_IEC_5218_2004(E_F).zip), in which ‘0’ indicates “unknown”, ‘1’ “male”, ‘2’ “female” and ‘9’ “not applicable”, even though the ISO standard is now widely considered inadequate.
Once a reference description of a name is available – either in the <profileDesc> of the header in the same XML document or in an external authority file to which the processor has access – naming elements anywhere in the document, i.e. both in the header and the text, can point to it:
<p> ... [some text here] <name type="person" key="ThoJon001">Þórður Jónsson</name> [some more text here] ... </p>
Treating names in this way means that each person is uniquely identified with an ID, to which all individual instances of that person’s name then refer, whatever form those instances take. This solves problems not only of variant spellings but also of cases where, for example, a medieval author is known by a Latin name and any number of vernacular forms, many or all of which may have claims to “authenticity”. In order to ensure uniformity, the method generally employed in the library world has been to accept the form found in some authority file, for example that of the American Library of Congress, as the “base” or “neutral” form. Feelings can run high on this matter, however, and people are frequently reluctant to accept as “neutral” an overtly “foreign” form of the name of some local saint or hero. Within the <person> element any number of variant forms of a name can be given in multiple <persName> elements. These can be assigned to a particular language by means of the @xml:lang attribute, but otherwise no form is given priority over another, which reduces the likelihood of offence.
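The treatment of variant name forms can be illustrated with a minimal sketch. The identifier OlaHar001, the selection of name forms and the language codes are assumptions made for this example only and are not taken from an existing authority file:

```xml
<person xml:id="OlaHar001">
  <!-- hypothetical identifier; each <persName> gives one variant form,
       and none of them is marked as the preferred one -->
  <persName xml:lang="la">
    <forename>Olavus</forename>
  </persName>
  <persName xml:lang="is">
    <forename>Ólafur</forename>
    <addName type="patronym">Haraldsson</addName>
  </persName>
  <persName xml:lang="nb">
    <forename>Olav</forename>
    <addName type="patronym">Haraldsson</addName>
  </persName>
</person>
```

Any occurrence of the name in the header or the text, whatever its spelling, could then carry the same value in its @key attribute, for instance <name type="person" key="OlaHar001">.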
The chief advantage of treating names in this way, however, is for searching, in particular once one has put together a large body of material. It is possible not only to search for certain places or persons with a particular name, but also for people who were born in a particular place at a particular time. The entirety of all <person> elements defined in <listPerson> elements can further function as a reference tool, a veritable Who’s Who in medieval and early-modern Scandinavia. The possibilities with regard to scribes are especially exciting, as it would be a relatively easy matter to add images to the <person> elements showing the hand or hands of each scribe, making it possible eventually to produce a register of all known scribes, searchable in terms of date, location etc.
It is possible to treat bibliographical references in a similar way. Since many of the same works are likely to be referred to again and again it would seem most sensible to provide full bibliographical information only once, in a separate bibliography (usually in an authority file), to which all bibliographical references in the individual records could then point.
The following is a simple bibliographical record as typically found either in the header or in a bibliographical authority file:
<biblStruct xml:id="StudIsl24">
  <analytic>
    <author>
      <forename>Ólafur</forename>
      <addName type="patronym">Halldórsson</addName>
    </author>
    <title level="m">Helgafellsbækur fornar</title>
    <title level="s">Studia Islandica</title>
  </analytic>
  <monogr>
    <imprint>
      <biblScope unit="vol">XXIV</biblScope>
      <pubPlace>Reykjavík</pubPlace>
      <date when="1966">1966</date>
    </imprint>
  </monogr>
</biblStruct>
Again, using the unique value of @xml:id, a <bibl> element anywhere in an XML file (that has access to the detailed record) can refer to that information using the <ref> element with an identifying @target attribute. Since @target takes a pointer, a record in the same file is addressed with a leading “#”, as follows:
<bibl><ref target="#StudIsl24">Ólafur Halldórsson 1966</ref>, pp. 18 and 22</bibl>
14.4 The encoding description
The <encodingDesc> should document the relationship between the electronic edition and the source it is based upon. It is an optional part of the header, but we recommend that it contain information on the standard of encoding and the level of quality. It should have two sub-elements: a <projectDesc> and an <editorialDecl>.
The <projectDesc> can be used to specify in prose the standard of the encoding, e.g. “This text has been encoded according to the standard set out in The Menota Handbook, version 3.0, at https://menota.org/handbook.xml”.
The <editorialDecl> uses the <correction> element with the @status attribute to specify the level of quality control. Attribute values (according to TEI) are “high”, “medium”, “low”, and “unknown”. The TEI P5 Guidelines (ch. 2.3.3 “The Editorial Practices Declaration”) have these definitions for the possible values:
- high: the text has been thoroughly checked and proofread
- medium: the text has been checked at least once
- low: the text has not been checked
- unknown: the correction status of the text is unknown
Once the @status attribute is given a value, the <correction> element may be empty. However, if desired, further specification can be given in prose within a <p> element.
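In its shortest form, the declaration of the correction status is thus an empty element. A minimal sketch, in which the value “medium” has been chosen purely for illustration:

```xml
<editorialDecl>
  <!-- the status alone is sufficient; a <p> with further details is optional -->
  <correction status="medium"/>
</editorialDecl>
```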
Next within the <editorialDecl> element, a <normalization> element with a Menota-specific @me:level attribute is used to specify the level on which the text is encoded. The prototypical levels are “facs”, “dipl” and “norm”, but other levels can also be used in the transcription and should then be specified, e.g. a “pal” level. (See ch. 4 on levels of text representation.) Here, too, a description in prose may be added in a <p> element. Note that more than one level may be specified, simply by separating the values with whitespace:
<editorialDecl>
  <normalization me:level="facs dipl norm">
    <p>This text has been encoded on three levels: facsimile, diplomatic and normalised.</p>
  </normalization>
</editorialDecl>
Finally within the <editorialDecl> element, an <interpretation> element is used to specify the amount of lexical and grammatical information in the encoded text. We suggest two attributes, @me:lemmatized and @me:morphAnalyzed, both with the values “completely”, “partly” and “none”, but an additional description in prose may be added in a <p> element. A lemmatised text will have lemmata (i.e. dictionary entries) added in the @lemma attribute of the <w> element, while a morphologically analysed text will have grammatical forms specified in the @me:msa attribute of the same element. See ch. 5.3 for a general overview and ch. 11 for details on this lexical and morphological encoding.
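What such annotation looks like on an individual word can be sketched as follows. This is an illustration under stated assumptions rather than a normative example: the lemma and the value of @me:msa are chosen for this sketch, the internal mark-up of the word on the various levels is omitted, and the namespace prefix me: is assumed to be declared elsewhere in the file. The actual inventory of values is defined in ch. 11.

```xml
<!-- illustrative values only; see ch. 11 for the rules governing @me:msa -->
<w lemma="konungr" me:msa="xNC gM nS cN sI">konungr</w>
```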
A complete <encodingDesc> may look like this:
<encodingDesc>
  <projectDesc>
    <p>This text has been encoded according to the standard set out in <title>The Menota handbook</title>, version 3.0, at https://menota.org/handbook</p>
  </projectDesc>
  <editorialDecl>
    <correction status="high">
      <p>This text was proofread by <name type="person">Magnus Rindal</name> and colleagues before the publication of the printed version in 1981. It is unlikely that it contains any significant number of errors. However, it cannot be ruled out that the subsequent conversion of the file may have introduced some systemic errors.</p>
    </correction>
    <normalization me:level="dipl">
      <p>This text has been encoded on a diplomatic level, according to the editorial practice of Norsk Historisk Kjeldeskrift-Institutt.</p>
    </normalization>
    <interpretation me:lemmatized="completely" me:morphAnalyzed="completely">
      <p>The complete text has been lemmatised and morphologically analysed according to the rules specified in ch. 11 of the Menota Handbook, v. 3.0.</p>
    </interpretation>
  </editorialDecl>
</encodingDesc>
14.5 The profile description
The <profileDesc> is an optional part of the header. We recommend that the language(s) used in the source be listed here within the element <langUsage>. <langUsage> contains one or more <language> elements, each with an @ident attribute. The value of @ident should be a three-letter code, where possible based on the international standard ISO 639-2.
ISO 639-2 contains a list of three-letter abbreviations of language names. In addition to codes for the modern languages, such as “dan” (Danish), “ice” or “isl” (Icelandic), “nor” (Norwegian) and “swe” (Swedish), it lists languages like Latin (“lat”) and Ancient Greek (“grc”). For Medieval Nordic, there is only one abbreviation: “non” for Old Norse, i.e. Old Icelandic and/or Old Norwegian. Since Old Norse is a problematic term and the abbreviation “non” is idiosyncratic, we recommend using the values “oda” (Old Danish), “oic” (Old Icelandic), “onw” (Old Norwegian) and “osw” (Old Swedish) instead. In cases of uncertainty, a hyphen may be used, e.g. “oic-onw” for a manuscript which is either Old Icelandic or Old Norwegian (but most probably Old Icelandic), and “onw-oic” in the opposite case. Please note that this usage is not ISO conformant.
Additionally, the <profileDesc> may be used to identify and give more detailed information on, for instance, persons or scribal hands that are referred to elsewhere in the document. Alternatively, separate authority files are employed, in which information is gathered. Such authority files are usually quite simplified, focusing on the authority list in question. In ch. 14.3.6 we have already encountered an example of how this could be useful for referring to a person by means of a key that points to information stored in <listPerson> and <person>. Similarly, different scribal hands in the source document can be given IDs, so they may be referred to elsewhere in the document, most notably in the transcription to mark a change of hands.
A <profileDesc> may look like this:
<profileDesc>
  <langUsage>
    <language ident="oic">Old Icelandic</language>
    <language ident="onw">Old Norwegian</language>
    <language ident="osw">Old Swedish</language>
    <language ident="oda">Old Danish</language>
    <language ident="oic-onw">Old Icelandic with Old Norwegian traits</language>
    <language ident="onw-oic">Old Norwegian with Old Icelandic traits</language>
    <language ident="lat">Latin</language>
    <language ident="grc">Ancient Greek</language>
  </langUsage>
  <handNotes>
    <handNote xml:id="h1">Scribe 1 has not been identified.</handNote>
    <handNote xml:id="h2">Scribe 2 has not been identified.</handNote>
  </handNotes>
</profileDesc>
Note that the Profile Description may list more languages than actually referred to in the text. (See also ch. 11.7.)
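The hand identifiers defined in the <handNote> elements can then be referred to from the transcription. The following is a minimal sketch, assuming that the TEI element <handShift> with its @new attribute is used to mark the change and that the identifiers h1 and h2 are those of the <profileDesc> example above; the surrounding text is a placeholder:

```xml
<p>
  ... [text written by scribe 1]
  <!-- from this point onwards the text is in the hand described in handNote h2 -->
  <handShift new="#h2"/>
  [text written by scribe 2] ...
</p>
```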
14.6 The revision description
Even if this is an optional part of the header, it is essential that all changes to the file are recorded. Each change is described within a separate <change> element. Within it, the <date> is given first, then the <name> of the revisor (preferably with affiliation), and, finally, a description in prose of the actual change.
A short series of <change> elements may look like this:
<revisionDesc>
  <change>
    <date>2017-07-19</date>
    <name>Odd Einar Haugen</name>
    <orgName type="affiliation">University of Bergen</orgName>: Revised the encoding in accordance with v. 3.0 of the Menota handbook.
  </change>
  <change>
    <date>2006-04-18</date>
    <name>Tone Merete Bruvik</name>
    <orgName type="affiliation">Aksis</orgName>: Revised the transcription in accordance with v. 2.0 of the Menota handbook.
  </change>
</revisionDesc>
14.7 Example headers
Three examples of Menota headers can be accessed in Appendix E.
The first two examples are for full manuscripts: one header is for a single-text source, Holm perg 6 fol (Barlaams ok Josaphats saga), while the other is for a multi-text source, AM 242 fol (Codex Wormianus). The third example is for a manuscript fragment, NKS 235 g 4to (Konungs skuggsjá). This header is more detailed and includes information on the material features of the manuscript and its provenance.
Note that almost all texts in the Menota archive can be downloaded as XML files. This means that almost any header in the archive can be inspected, and, if convenient, used as an example.