Ch. 14. The header
Version 3.0 beta
This is a preliminary version which can be changed or updated at any time.
The whole chapter has been revised by Beeke Stegmann.
This chapter deals with the first major part of any Menota XML file, the header. The header should describe the file so that meta-level information about the text itself, its source, its encoding and its revisions is sufficiently documented. The header has four major parts:
|Elements / attributes||Contents|
|<fileDesc>||A file description|
|<encodingDesc>||An encoding description|
|<profileDesc>||A text profile|
|<revisionDesc>||A revision history|
This chapter will discuss the recommended (minimal) amount of information for each of the four parts. Due to practical considerations, however,
<fileDesc> is treated in two separate chapters, one dealing with the meta-level information on the file history, the other
concerning the source from which the XML-file is created, i.e. in our case usually the manuscript.
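As a rough orientation, the four parts can be pictured as siblings inside the header. The sketch below assumes the standard TEI P5 element names; the comments are placeholders only, and the skeleton is not a valid header until each part has been filled in as described in this chapter.

```xml
<teiHeader>
  <fileDesc>
    <!-- title, edition, extent, publication and source description -->
  </fileDesc>
  <encodingDesc>
    <!-- description of the encoding -->
  </encodingDesc>
  <profileDesc>
    <!-- text profile -->
  </profileDesc>
  <revisionDesc>
    <!-- revision history -->
  </revisionDesc>
</teiHeader>
```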
14.2 The file description: Title, edition, extent and publication
The file description is a mandatory part of the header and must include information on the title, on the publication and on the source, cf.
ch. 2.2 “The File Description”
of the TEI P5 Guidelines. It contains a number of elements, several of which were discussed in ch. 10 above.
The first part of the
<fileDesc> supplies the necessary bibliographical information about the digital text in respective title, edition, extent and publication statements.
The main editor(s) should always be identified, but since many electronic editions are the result of teamwork (or of cumulative work), all major contributors should be listed.
Please note: The description of the source for the transcription (done in the element
<sourceDesc>) is also part of the <fileDesc>.
However, in order to improve legibility, that part is treated in a separate subchapter (see ch. 14.3 below).
14.2.1 Title statement
|Elements / attributes||Contents|
|<titleStmt>||Information on the title, editor and other people who have been responsible for the edition|
|<title>||The title of the work|
|<editor>||The name of the editor of the encoded work|
|@role||The role of an editor, in particular in case the editor is an institution or project|
|<orgName>||The name of an organisation|
|@type||Specifies that the organisation is the institution with which the editor is affiliated|
|<respStmt>||A statement of responsibility|
|<resp>||Type of responsibility, e.g. transcription, conversion, proof-reading|
In the <titleStmt>, the
<title> element gives the title of the document. In the case of a digital transcription,
it should specify the primary source (manuscript) on which the transcription is based, and, where applicable, the title of the work transcribed.
We recommend that the title also states that the present text is an electronic edition. In single-text manuscripts, the title may look like this:
<title>Holm perg 6 fol : Barlaams ok Josaphats saga : an electronic edition</title>
In multi-text manuscripts, the title will by necessity be somewhat longer:
<title>AM 242 fol (Codex Wormianus) : Snorra-Edda, the four grammatical treatises, Rígsþula, Maríukvæði, and ókennd heiti : an electronic edition</title>
The full list of titles for the works treated in the document will be given in the
<sourceDesc> (see ch. 14.3 below),
meaning that it is not necessary to include all details in the document title.
Since manuscript sigla are sometimes given according to various standards (for instance, “Holm perg 6 fol” is in many contexts referred to as “Sth. perg. 6 fol”), for Old Norse sources we recommend using the sigla in the index volume of Ordbog over det norrøne prosasprog (also accessible on the ONP's web page at http://onpweb.nfi.sc.ku.dk/mscoll_d.html). The manuscript siglum should always be given in full.
In addition to the title, the
<titleStmt> must also list the editor(s) and other contributors to the edition.
We recommend that one or more people (or institutions) are identified as the main editor(s) of the text in the <editor> element:
<editor> <name> <persName> <forename>Magnus</forename> <surname>Rindal</surname> </persName> <orgName type="affiliation">University of Bergen</orgName> </name> </editor>
Multiple main editors should be listed either in alphabetical order, if equally responsible, or in order of importance concerning the editing work. Institutions as well as individuals may be given as editors:
<editor role="institution"> <name> <orgName>Gammelnorsk Ordboksverk / Enhet for digital dokumentasjon</orgName> </name> </editor> <editor role="person"> <name> <persName> <forename>Christian-Emil</forename> <surname>Ore</surname> </persName> <orgName type="affiliation">University of Oslo</orgName> </name> </editor>
In this example, the attribute
@role indicates that the first editor is an institution rather than an individual.
For the second editor,
@role with the value 'person'
is optional, but may be inserted for clarity.
If an <editor> element does not have any @role attribute,
it is assumed to describe a single person.
If other people than the main editor(s) contributed to the work, this is specified in one or more responsibility statements (<respStmt>).
In most cases it will be preferable to use one
<respStmt> per task or set of tasks (rather than per contributor) and list all the contributors who worked
on this specific task or tasks.
In order to clarify the division of work and responsibility, the main editors should also be added to the relevant responsibility statements.
<respStmt> <resp>Lemmatisation and morphological encoding</resp> <name> <persName> <forename>Jon Erik</forename> <surname>Hagen</surname> </persName> <orgName type="affiliation">University of Bergen</orgName> </name> </respStmt> <respStmt> <resp>Conversion to Menotic XML</resp> <name> <persName> <forename>Christian Emil</forename> <surname>Ore</surname> </persName> <orgName type="affiliation">University of Oslo</orgName> </name> </respStmt>
If contributors are responsible for several activities, this may be specified in more than one
<resp> element within the same
<respStmt>, listed in chronological order.
Note, however, that if more than one contributor is listed in a single responsibility statement, this will always be interpreted as meaning that all listed contributors worked on all the tasks given in the
<respStmt> in question. If this is not the case, multiple responsibility statements should be used.
<respStmt> <resp>Transcription of primary source</resp> <resp>Conversion to XML</resp> <resp>Lemmatisation</resp> <name> <persName> <forename>Karl G.</forename> <surname>Johansson</surname> </persName> <orgName type="affiliation">University of Oslo</orgName> </name> </respStmt>
As discussed in ch. 10 above, patronymica should not be encoded as surnames, but rather as additional names:
<editor> <name> <persName> <forename>Haraldur</forename> <addName type="patronym">Bernharðsson</addName> </persName> <orgName type="affiliation">University of Iceland</orgName> </name> </editor>
When listing persons in alphabetical order, a surname should be given before any forenames, e.g. “Rindal, Magnus”. In the absence of a surname, a forename is given before an additional name, e.g. “Haraldur Bernharðsson”.
The TEI P5 Guidelines also recommend that the element
<author> is included in the
<titleStmt> (ch. 2.2.1 “The Title Statement”).
Since almost all Medieval Nordic texts are anonymous, we believe this element is not required.
14.2.2 Edition statement
|Elements / attributes||Contents|
|<editionStmt>||A statement of the edition|
|<edition>||A description of the edition (i.e. version), typically by means of a number|
|@n||The number of the edition|
The <editionStmt> should be used to specify whether the present text is a new or a revised edition of the
electronic text as described in the title statement above. Here, “edition” is to be understood as “version”.
The version number should be given in the
@n attribute using the usual numbering
system, i.e. 1.0, 1.0.1, 1.1, 1.2, etc., while the date of the version should be given in the format year-month-day in the attribute
@when of the <date> element, e.g. '2004-02-01'.
A complete edition statement may be as simple as this:
<editionStmt> <edition n="1.0">First draft, <date when="2004-02-01"> 1 February 2004</date>.</edition> </editionStmt>
14.2.3 Extent
|Elements / attributes||Contents|
|<extent>||The size of the file, preferably specified in words|
|@n||The number of words (or any other measure)|
The <extent> element specifies the size of the file. The exact number of words should be given in the
@n attribute as well as in plain text within the element, e.g.:
<extent n="76411">76411 words</extent>
14.2.4 Publication statement
|Elements / attributes||Contents|
|<publicationStmt>||A statement of the publication|
|<distributor>||A reference to the distributor, e.g. Medieval Nordic Text Archive|
|<idno>||A reference (identification number), e.g. “Ms. 1”|
|@type||The type of reference, e.g. 'Menota'|
|<date>||The date for the publication of the edition|
|@when||The date in the year-month-day format, e.g. 2017-03-08|
|<availability>||A description of the conditions for the distribution and use of the text|
|@status||The type of availability, typically with the values 'free', 'restricted' or 'unknown'.|
The <distributor> element specifies the body (publisher, archive) which has made the text available, e.g. the Medieval Nordic Text Archive (Menota).
The <idno> element gives a unique identification of the text. For texts in the Menota archive, the value of the @type attribute will be 'Menota', and the contents of the element
will be an acquisition number, beginning with Ms. 1. Note that this information will be supplied by Menota if the text is being deposited in this archive.
The <availability> element specifies the accessibility of the text. We recommend adding a
@status attribute with one of the three values “free”, “restricted”, “unknown”
(cf. ch. 2.2.4 “Publication, Distribution, etc.”
of the TEI P5 Guidelines). Further specifications can be added in a
<p> element. Texts in the Menota archive typically have restricted availability, and we recommend
adding this description:
“This text is available for purposes of academic research and teaching only. Re-distribution in any form without prior permission is prohibited.
Short extracts may be cited with full acknowledgment of the source.” A complete publication statement may thus look like this:
<publicationStmt> <distributor>Medieval Nordic Text Archive</distributor> <idno type="Menota">Ms. 1</idno> <date when="2004-03-01">1 March 2004</date> <availability status="restricted"> <p>This text is available for purposes of academic research and teaching only. Re-distribution in any form without prior permission is prohibited. Short extracts may be cited with full acknowledgment of the source.</p> </availability> </publicationStmt>
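To show how the four statements fit together, the following sketch assembles them into a single <fileDesc>. The values are simply copied from the separate examples above for illustration and need not describe one and the same edition; the <sourceDesc> is left as a placeholder, since it is treated in ch. 14.3.

```xml
<fileDesc>
  <titleStmt>
    <title>Holm perg 6 fol : Barlaams ok Josaphats saga : an electronic edition</title>
    <editor>
      <name>
        <persName>
          <forename>Magnus</forename>
          <surname>Rindal</surname>
        </persName>
        <orgName type="affiliation">University of Bergen</orgName>
      </name>
    </editor>
  </titleStmt>
  <editionStmt>
    <edition n="1.0">First draft, <date when="2004-02-01">1 February 2004</date>.</edition>
  </editionStmt>
  <extent n="76411">76411 words</extent>
  <publicationStmt>
    <distributor>Medieval Nordic Text Archive</distributor>
    <idno type="Menota">Ms. 1</idno>
    <date when="2004-03-01">1 March 2004</date>
    <availability status="restricted">
      <p>This text is available for purposes of academic research and teaching only.</p>
    </availability>
  </publicationStmt>
  <sourceDesc>
    <!-- manuscript description, see ch. 14.3 -->
  </sourceDesc>
</fileDesc>
```

The order of the child elements follows the order in which they are discussed in this chapter, with the source description last.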
14.3 Source description: Manuscript description
The <sourceDesc> element is a mandatory part of the header and describes the source material
(cf. ch. 2.2.7 “The Source Description”
of the TEI P5 Guidelines). It is a child of
<fileDesc> , and in the case of a digital edition, the source is the manuscript carrying the transcribed text.
Therefore, the source is described using the element
<msDesc> (manuscript description), which is placed within <sourceDesc>.
With TEI P5, this part of the header includes specific elements for manuscript description, based chiefly on the work of the EU-funded MASTER project (1999-2001)
and the TEI Medieval Manuscripts Description Work Group (1998-2000). For more detailed information on the manuscript description module, see
ch. 10 “Manuscript Description” of the TEI P5 Guidelines.
The <msDesc> element is the framing element into which the manuscript description
is put. The description need not consist of more than the basic
information necessary to identify the source, i.e. its location, both
geographical and institutional, and its shelfmark or other
identifying number or name (e.g. Oslo, Universitetsbibliotek, UB 1042
8vo). However, it is also possible to provide a detailed description of
the source, analogous to what one would find in a thorough catalogue record or in the introduction to a
scholarly edition.
(Note that while the
<msDesc> element will normally appear within
the document header, it can also appear
anywhere within the body of a TEI conformant document, in the same way
as bibliographic elements such as <bibl>.)
Within <msDesc>, the following seven elements are available, of which only the first is required:
|Elements / attributes||Contents|
|<msIdentifier>||Groups information that uniquely identifies the manuscript, i.e. its location, holding institution and shelfmark.|
|<msContents>||Contains an itemised list of the intellectual content of the manuscript or manuscript part, either as a series of paragraphs or as a series of structured manuscript items, possibly including transcriptions of rubrics, incipits, explicits etc., as well as primary bibliographic references.|
|<physDesc>||Groups information concerning all physical aspects of the manuscript or manuscript part, its material, size, format, script, decoration, binding, marginalia etc.|
|<history>||Provides information on the history of the manuscript or manuscript part, its origin, provenance and acquisition by its holding institution.|
|<additional>||Groups other information about the manuscript, in particular, administrative information relating to its availability, custodial history, surrogates etc.|
in essence a nested
Within each of these elements a
number of sub-elements is available;
<msContents>, for
example, will normally consist of one or more <msItem>
elements, each in turn containing specific elements such as
<colophon> (see ch. 14.3.2 below). Technically, the contents need
not be this structured, since with all the elements listed above, apart from
<msIdentifier> , there is also the option of using ordinary
prose, marked up with the
<p> element. However, doing so would
limit greatly the possibilities both for processing and searching the
data. On the other hand, it could be preferable when dealing with pre-existing
descriptions (so-called “legacy data”), the exact form of
which one may wish, or be required, to maintain.
If at all possible, we recommend using the available specific elements to structure and mark up the manuscript description.
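The two options can be combined in one description. The sketch below is only an illustration: it gives the minimal identification of the manuscript mentioned above in structured form, and shows a <history> that takes over a pre-existing prose description as a paragraph. The bracketed text is a placeholder, not actual catalogue data.

```xml
<sourceDesc>
  <msDesc>
    <!-- the only required part: structured identification -->
    <msIdentifier>
      <settlement>Oslo</settlement>
      <repository>Universitetsbibliotek</repository>
      <idno>UB 1042 8vo</idno>
    </msIdentifier>
    <!-- legacy data kept as ordinary prose -->
    <history>
      <p>[Prose account of the history of the manuscript, taken over from an existing catalogue record.]</p>
    </history>
  </msDesc>
</sourceDesc>
```

A prose paragraph like this remains readable, but its contents cannot be searched by origin, date or owner in the way that structured sub-elements allow.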
14.3.1 Manuscript identifier
The only mandatory element within <msDesc> is
<msIdentifier>. For
<msIdentifier>, a number of
sub-elements is available, among others
<settlement> (the TEI term for what most people would call a town or city), <repository> and
<idno> (an identifying number, here used for the shelfmark of a manuscript).
Although not required, it is strongly recommended that at least the elements <settlement>, <repository> and
<idno> are included, since
they provide what is, by common consent, the minimum amount of
information necessary to identify a manuscript. In many cases, no
other elements are needed, as common sense will suffice to
distinguish, say, Paris, France from Paris, Texas, as the location of
Bibliothèque Nationale. For search purposes,
however, it is probably a good idea to include as much information as
possible, such as
<country> and, where applicable,
There are two further sub-elements of <msIdentifier>:
<altIdentifier>, which contains an alternative, structured
identifier of a manuscript, such as a catalogue number or former shelfmark, and
<msName>, which contains any form of unstructured alternative name
used for a manuscript, such as a nickname. The manuscript Uppsala, Universitetsbiblioteket, DG 1, for
example, is far better known under the name
Codex Argenteus or
Silver Bible. There are many examples of such nicknames
among Nordic manuscripts, for instance in the Arnamagnæan Collection, as
Árni Magnússon frequently gave his manuscripts names
based on the places where they had been made or where he got them
from or the people whom he knew to have produced or possessed them.
Occasionally a manuscript can have several such names, in which case multiple
<msName> elements are used, or perhaps
rather several forms of the name, typically in different languages.
The latter can be distinguished from each other by means of the
@xml:lang attribute, which is
available on all TEI elements.
An <msIdentifier> for a manuscript in the Arnamagnæan
Collection may look like this:
<msIdentifier> <country key="DK">Danmark</country> <settlement>København</settlement> <repository>Den Arnamagnæanske Samling</repository> <idno>AM 45 fol.</idno> <altIdentifier type="KKKat"> <idno>59</idno> </altIdentifier> <msName type="nickname" xml:lang="la">Codex Frisianus</msName> <msName type="nickname" xml:lang="is">Fríssbók</msName> </msIdentifier>
The value of the @key attribute on
<country>, which is the standard
international two-letter code, is useful for search purposes, enabling one to find
all manuscripts in Danish repositories regardless of whether the
cataloguer has given the name of the country as “Denmark”
or “Danmark” (or, for that matter, “Dänemark”, “Dinamarca” or “Дания”).
There are many such attributes in the manuscript description tagset
which allow for cross-language searches.
In addition to the commonly used shelfmark, this example also has an
<altIdentifier> , giving the running number from Kålund's printed catalogue.
That this reference is to Kålund's catalogue can be seen from the
@type attribute with the abbreviation “KKKat” as its value.
14.3.2 Intellectual content
Detailed description of a manuscript’s intellectual contents is put in the
<msContents> element, which is the next major sub-element of <msDesc>.
<msContents> consists of one or more
<msItem> elements, which can be prefaced, if desired, by a
Each <msItem> element typically contains at least the elements <locus> and
<title> to specify the location in the manuscript and the title of the text in question.
More detail can be provided by means of further elements, such as
<msItem> elements are further allowed to “nest”, by which is meant that an
<msItem> can contain other <msItem>
elements. This is useful where separate items (or subitems) in a manuscript are
grouped under a single title or rubric, for example in collections of
The @defective attribute, with
possible values of 'true' and 'false',
is available on
<msItem>, providing the means
of distinguishing between texts which are fragmentary
and those which are not. The attribute is also available on <incipit> and
<explicit>. When dealing with collections of fragments,
each fragment may be given as a separate <msItem>, with
the first and last words of each transcribed as defective incipits
and explicits, as in the following example, a manuscript containing
four fragments of a single work:
<msContents> <msItem n="1" defective="true"> <locus from="1r" to="9v">1r-9v</locus> <title>Knýtlinga saga</title> <msItem n="1.1"> <locus from="1r:1" to="2v:30">1r:1-2v:30</locus> <incipit defective="true">dan<ex>n</ex>a a engl<ex>an</ex>di</incipit> <explicit defective="true">en meðan har<ex>aldr</ex> hein hafði k<ex>onung</ex>r v<ex>er</ex>it yf<ex>ir</ex> danmork</explicit> </msItem> ... <!-- msItems 1.2 to 1.4 --> ... </msItem> </msContents>
The standard TEI element
<bibl> (and the grouping parent element
<listBibl> ) is also available within
<msItem> . This should be used to provide bibliographical
information on the
<msItem> level, i.e. concerning
editions of the item in question. Bibliographical information
pertaining to the manuscript as a whole should be placed in the
<additional> element, described below (see ch. 14.3.5).
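A minimal sketch of bibliographical information at the item level might look as follows. The bracketed entries are placeholders for real references and are not taken from any actual bibliography; the <locus> and <title> reuse the values from the example above.

```xml
<msItem n="1">
  <locus from="1r" to="9v">1r-9v</locus>
  <title>Knýtlinga saga</title>
  <listBibl>
    <!-- one bibl per edition of this particular item -->
    <bibl>[Reference to a printed edition of this item]</bibl>
    <bibl>[Reference to a further edition of this item]</bibl>
  </listBibl>
</msItem>
```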
14.3.3 Codicological features
The following major element in <msDesc> is
<physDesc>, i.e.
physical description. The first element within <physDesc> is
<objectDesc>, which relates specifically to the text-bearing
object and contains two further sub-elements, <supportDesc> and <layoutDesc>. <supportDesc>
can contain various elements relating to the physical object, or carrier,
on which the text is inscribed, such as
<support> , describing whether the text is written on parchment, paper etc.
and a description thereof;
<extent>, the number and size of the leaves;
<foliation> , how and, if known, when and by whom the
manuscript has been paginated/foliated;
<collation> , a
description of the quire structure, any missing leaves and so on; and
<condition> , for a description of the present physical
state of the manuscript.
<layoutDesc> contains one or more
<layout> elements, detailing the way(s)
in which the text is organised on the page, specifying the number of
columns, the dimensions of the written area, the number of lines per
column etc. The example below shows a simple <objectDesc>.
Please note that this
<objectDesc> only makes use of selected elements.
<objectDesc form="fragment"> <supportDesc> <support> <p><material>Parchment</material>.</p> </support> <extent>2 leaves; 220 mm (height) by 150 mm (width).</extent> <foliation> <p>The manuscript is foliated: 1-2.</p> </foliation> <collation>One bifolium.</collation> </supportDesc> <layoutDesc> <layout columns="1" writtenLines="23 26"> <p>The text is written in one column of 23-26 lines each; there are approximately 10-12 words per line; the written area measures 205 mm (height) by 130 mm (width).</p> </layout> </layoutDesc> </objectDesc>
The second group of elements within a structured physical description concerns aspects of the writing,
illumination or other notation (notably, music) found in a manuscript, including additions made in later hands – the text,
as opposed to the carrier. Possible elements are:
<handDesc>, usually containing one or more <handNote> elements;
<decoDesc>, containing one or more <decoNote> elements;
as well as <musicNotation> and
<additions>, both containing one or more paragraph elements.
The <handDesc> element is intended for a description of the
scribal hand or hands of the manuscript. This may simply be encoded as
one or more
<p> elements, but more commonly, the various
paragraphs are structured as a series of
<handNote> elements, each containing a
prose description of one of the hands. The following is an
example of a brief
<handDesc> element for a manuscript written by a single scribe:
<handDesc hands="1"> <p>Written in <term type="script">Gothic hybrid</term>. The scribe is unknown, but the same hand is found in sections of AM 23 4to and GkS 25 fol.</p> </handDesc>
Please note: The use of the TEI element
<term> with its attribute
@type as in the example is optional.
Such encoding allows for more precise searching than
would be possible with free text, but is obviously dependent on there
being a commonly agreed taxonomy.
The next example (below) is taken from a manuscript written by six different scribes, where a
<handNote> element is used to describe each individual hand. The
@script attribute on <handNote> is
employed to indicate the type of script. Note also the @scope attribute
with the possible values of 'sole', 'major' and 'minor'. The <locus>
element, which we have already encountered in
<msItem>, is used to
indicate specifically which parts of a manuscript are written in a particular hand:
<handDesc hands="6"> <handNote script="Hybrida" scope="major"> <p>The main hand (Hand 1) writes <locus>ff. 1r-9r and 16r-118v</locus> in a practised Gothic hybrid.</p> </handNote> .... <!-- more handNote elements follow here --> </handDesc>
Since a <handNote> element may contain one or more
paragraphs, there is no limit to the amount of information which may
be given on any single hand. Thus, a detailed analysis of
palaeographical and orthographical features (“/a/ is of the
two-storey kind” etc.) is perfectly possible within this element.
There is a corresponding element,
<decoDesc> for the description of illumination and
other decorational features in the manuscript.
<decoDesc> , like
<handDesc> , may simply
contain one or more paragraphs, but ideally, it consists of a sequence of topically organised
<decoNote> elements, each describing either
a decorative component of a manuscript (e.g. a single illuminated
initial) or a homogenous class of such components (e.g. sentence
initials in general).
The example below shows the usage of multiple <decoNote> elements within
<decoDesc>. For detailed step-by-step guidelines on the description of illuminations in
<decoNote> see ch. 7.2.
<decoDesc> <decoNote> <p>F. <locus from="1v" to="1v">1v</locus>: full-page illumination, Holy King <name type="person">Óláfr</name>. Caption: <q>Olafur · Haraldz son · Noreks kongur</q>. Colours: yellow, dark green, light red (vermillion), outline: dark brown.</p> </decoNote> <decoNote> <p>F. <locus from="3r" to="3r">3r:24-36</locus>: major pen flourished initial L. Colours: blue, light red.</p> </decoNote> <decoNote> <p>F. <locus from="7r" to="7r">7r</locus>: bas-de-page, two dogs running, a lamb and a bird. Colours: black, dark red, white(?).</p> </decoNote> ... <decoNote> <p>Ff. <locus from="1r" to="27v">1r-27v</locus>: sentence initials were added later, potentially by <name type="person">Jón Jónsson</name></p> </decoNote> ... </decoDesc>
If a manuscript contains musical notation, the element
<musicNotation> may be used to describe it.
The form, and possibly location, of such musical notation is specified using one or more paragraphs:
<musicNotation> <p>Fol. <locus>34v</locus>: Square notation on 4-line red staves.</p> </musicNotation>
The <additions> element is used to list or describe any marginalia or other additions to the
manuscript which may be considered of interest or importance.
Such additions can also be referred to and discussed elsewhere, for example as part of the <provenance>
element in cases where the marginalia provide evidence of ownership (see ch. 14.3.4 below).
In those cases we recommend describing (and potentially transcribing) the marginalia in detail here,
while merely referring to them in other contexts.
<additions> <p>The manuscript contains the following marginalia: <list> <item>Fol. <locus>4v</locus>, left margin: <q xml:lang="is">hialmadr <ex>ok</ex> <lb/>brynjadr</q>, in a fifteenth-century hand, imitating an addition made to the text by the scribe at this point.</item> <item>Fol. <locus>5r</locus>, lower margin: <q xml:lang="is"> þ<ex>e</ex>tta þiki m<ex>er</ex> v<ex>er</ex>a gott blek en<ex>n</ex>da kan<ex>n</ex> ek ecki betr sia</q>; fifteenth-century hand, probably the same as that on the previous page.</item> <item>Fol. <locus>9v</locus>, bottom margin: <q xml:lang="is">þessa bok uilda eg <sic>gæt</sic> lært med<lb/>an Gud gefe myer Gott ad <lb/>læra</q>; seventeenth-century hand.</item> </list> </p> </additions>
It should be noted that the various mechanisms for the transcription of primary sources described elsewhere in this handbook – expansion of abbreviations and so on – may be employed here as well.
The third group of elements in
<physDesc> pertains to things such as the binding and other material that might be attached to the manuscript or stored with it.
These elements include
<bindingDesc>, containing a description of the state of the present
as well as potential former bindings of a manuscript (given either as a series of paragraphs or as one or more distinct <binding> elements);
<sealDesc>, which supplies information about the seal(s) attached to a document or charter (again either as paragraphs
summarising the overall nature of the seals, or as one or more
<seal> elements); and
<accMat>, for describing and/or
transcribing any material not originally part of the manuscript but bound with it or otherwise accompanying it. For example, the small paper slips
on which Árni Magnússon frequently noted details about a manuscript or its provenance, which are frequently kept with the manuscript in question,
can be described here using one or more paragraphs.
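As an illustration of this third group, a sketch under the assumption that the manuscript has a present and a former binding as well as an accompanying note slip could look like this. All bracketed text is placeholder content, and the earlier parts of <physDesc> are only indicated by a comment.

```xml
<physDesc>
  <!-- objectDesc, handDesc, decoDesc, additions etc. would precede here -->
  <bindingDesc>
    <binding>
      <p>[Description of the present binding and its state.]</p>
    </binding>
    <binding>
      <p>[Description of a former binding, if known.]</p>
    </binding>
  </bindingDesc>
  <!-- for a document or charter with seals, a sealDesc with one or more seal elements would go here -->
  <accMat>
    <p>[Description and/or transcription of a note slip kept with the manuscript.]</p>
  </accMat>
</physDesc>
```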
14.3.4 The history of the manuscript
The <history> element
contains information on the history of the manuscript. Available
within it are three sub-elements:
<origin>, for
information on when, where and potentially for whom the manuscript was
written; <provenance>, in which any evidence of
ownership and use is provided; and
<acquisition>, which
describes when and how the manuscript was acquired by its current owner or holding
institution. Each of these elements may contain one or more paragraphs.
Alternatively, as with the other major elements in a manuscript
description, the <history> element may itself consist
of one or more paragraphs in which the history of the
manuscript is summarized.
Brief information about the origin of a manuscript is often available in catalogues. In the recommended XML-structure it can be encoded like this:
<history> <origin> <p>Written in <origPlace>Vadstena, Sweden</origPlace> in <origDate notBefore="1500" notAfter="1550">the first half of the 16th century</origDate>. </p> </origin> </history>
The usage of elements such as <origPlace> and
<origDate> with various suitable attributes is not required
but highly recommended as it facilitates searches.
Information regarding the provenance and acquisition of a manuscript is frequently more difficult to obtain, if at all available.
In manuscripts from the Arnamagnæan Collection, Árni Magnússon quite frequently left note slips containing such detail.
While the text of his notes would usually be transcribed in the element
<accMat> described above, it can be discussed and interpreted in the elements <provenance> and <acquisition>.
The same applies for other additions and marginalia (noted and potentially transcribed in
<additions> ) that may contain hints at
the provenance or general history of a manuscript.
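A fuller <history> with all three sub-elements might be structured along the following lines. This is only a sketch: the bracketed text stands in for real information, and the reference to a note slip assumes that the slip itself has been described or transcribed in <accMat>.

```xml
<history>
  <origin>
    <p>[When, where and for whom the manuscript was written.]</p>
  </origin>
  <provenance>
    <p>[Discussion of the evidence for earlier ownership and use,
      e.g. an owner's mark among the marginalia or a note slip,
      with a reference to where it is transcribed.]</p>
  </provenance>
  <acquisition>
    <p>[When and how the manuscript came to its present holding institution.]</p>
  </acquisition>
</history>
```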
14.3.5 Other information
The final large grouping element in
a manuscript description is the already mentioned element
<additional> . The first subsection of this
element is called
<adminInfo> , which, as its name
suggests, contains information pertaining to the curation and
management of the manuscript. Such information would not normally
form part of the introduction to a scholarly edition, but there is no
reason why it could not be included in the document header.
Sub-elements available here include
<custodialHist> , in
which information can be given on such matters as conservation, loans
and exhibitions and so on, either as a series of paragraphs or one or more
<custEvent> elements, and the standard TEI element
<availability>, for information on the
availability of the manuscript, for example any restrictions on its
use or access etc.
Also available within
<additional> is a
<surrogates> element for
information on photographic reproductions. Here it is possible
to provide information on, and links to, any digital reproductions of the manuscript
that may be available.
Finally, the element
<listBibl> is available within <additional>
for bibliographical information regarding the manuscript as a whole. Within
<listBibl>, one or more
<bibl> elements are used, one for each bibliographic reference.
References to editions or other bibliographical information relevant to individual text-items, on the other hand,
should rather be given under the appropriate
<msItem> (see ch. 14.3.2 above).
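The three parts of <additional> can be sketched as follows. The value 'conservation' for @type on <custEvent> and all bracketed text are illustrative assumptions, not prescribed values or real data.

```xml
<additional>
  <adminInfo>
    <availability status="restricted">
      <p>[Any restrictions on the use of, or access to, the manuscript.]</p>
    </availability>
    <custodialHist>
      <custEvent type="conservation">[Details of a conservation treatment.]</custEvent>
      <custEvent type="exhibition">[Details of a loan or exhibition.]</custEvent>
    </custodialHist>
  </adminInfo>
  <surrogates>
    <p>[Information on, and links to, photographic or digital reproductions.]</p>
  </surrogates>
  <listBibl>
    <!-- references concerning the manuscript as a whole -->
    <bibl>[Reference to a catalogue entry or study of the manuscript]</bibl>
  </listBibl>
</additional>
```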
14.3.6 Mark-up for names of persons, places and institutions as well as bibliographical references
Most of the elements that have been
mentioned so far have the character of predefined boxes into which information
of a certain type can be fitted. But it will be noted in the examples
cited that there are other kinds of elements which can appear
anywhere within the document, so-called “phrase-level
elements”, of which there is a large number available in the
TEI standard. These are primarily used in order to
facilitate certain types of processing and/or for search purposes.
All names (both in the header and the text), for example, can be tagged using the
<name> element, which may have a
@type attribute to indicate
whether they are the names of persons, places or organisations (such
as religious orders). If desired, more information about persons can be
provided by linking to a detailed description by means of the attribute @key.
This requires that that particular key has been defined elsewhere, for instance in a
<listPerson> element within the
<profileDesc> (see ch. 14.5 below).
<listPerson> can contain one or more
<person> elements, which are identified by a unique value of the
@xml:id attribute. The
same value is then given in the
@key attribute of any
<name> element that is pointing to the longer description.
The contents of the individual
<person> elements in <listPerson>
provide information on birth, death, residence, occupation and so on, either
as one or more paragraphs of running prose, or through the use of
specialised sub-elements. There are also attributes available to indicate
the gender and role of a person.
A detailed description for a person could for example look like this:
<listPerson> <person xml:id="ThorJon" sex="1" role="owner"> <persName xml:lang="is"> <forename>Þórður</forename> <addName type="patronym">Jónsson</addName> </persName> <birth notBefore="1672-01-01" notAfter="1672-12-31">1672</birth> <death when="1720-08-21">21 August 1720</death> <residence> <placeName> <settlement type="farm">Staðastaður</settlement> <region type="parish">Staðarsveit</region> <region type="county">Snæfellsnessýsla</region> <region type="compass">Western</region> <country key="IS">Iceland</country> </placeName> </residence> <occupation>Clergyman</occupation> </person> ... <!-- more <person> elements could follow --> ... </listPerson>
The value '1' of
@sex stands for “male”, while '2' stands for “female”.
Once such a detailed description of a name is available – either within the same XML document or in an external authority file to which the processor has access –
naming elements anywhere in the document, i.e. both in the header and the text, can point to it:
<p> some text here... <name type="person" key="ThorJon">Þórður Jónsson</name> ... </p>
Treating names in this way means
that each person is uniquely identified with an ID, to which all
individual instances of that person’s name then refer, whatever
form those instances take. This solves problems not only of
variant spellings but also where, for example, a medieval author is
known by a Latin name and any number of vernacular forms, many or all
of which may have claims to “authenticity”. In order to
ensure uniformity, the method generally employed in the library world
has been to accept the form found in some authority file, for example
that of the American Library of Congress, as the “base” or
“neutral” form. Feelings can run high on this matter,
however, and people are frequently reluctant to accept as
“neutral” an overtly “foreign” form of the name
of some local saint or hero. Within the
<person> tag any
number of variant forms of a name can be given in multiple
<persName> elements. These can be specified as being for a certain language using
@xml:lang, but otherwise there is no
prioritisation, and hence less likelihood of offence. The chief
advantage of treating person names in this way, however, is for searching,
in particular once one has put together a large body of material. It
is possible not only to search for persons with a particular name,
but also born in a particular place at a particular time. The
<person> elements taken as a whole can also function as
a reference tool, a veritable Who’s Who in medieval and
early-modern Scandinavia. The possibilities as regards scribes are
especially exciting, as it would be a relatively easy matter to add
images to the
<person> elements showing the hand or
hands of each scribe, making it possible eventually to produce a
register of all known scribes, searchable in terms of date, location and so on.
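The use of several name forms for one and the same person might be sketched as follows. This is only an illustration: the ID “OlafHar”, the choice of name forms and the language codes in @xml:lang are assumptions made for this example and are not taken from any actual authority file.

```xml
<person xml:id="OlafHar">
  <!-- one <persName> per variant form; no form is given priority -->
  <persName xml:lang="non">Óláfr Haraldsson</persName>
  <persName xml:lang="la">Sanctus Olavus</persName>
  <persName xml:lang="nb">Olav den hellige</persName>
  <persName xml:lang="en">Saint Olaf</persName>
</person>
```

Any name in the text that carries the same value in its @key attribute would then refer to this one description, whichever form it takes in the source.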
It is possible to treat bibliographical references in a similar way. Since many of the same works are likely to be referred to again and again it would seem most sensible to provide full bibliographical information only once, in a separate bibliography, to which all bibliographical references in the individual records could then point.
The following is a simple bibliographical record as typically found either in the header or in a separate bibliographical authority file:
<biblStruct xml:id="StudIsl24">
  <analytic>
    <author>
      <forename>Ólafur</forename>
      <addName type="patronym">Halldórsson</addName>
    </author>
    <title level="m">Helgafellsbækur fornar</title>
    <title level="s">Studia Islandica</title>
  </analytic>
  <monogr>
    <imprint>
      <biblScope unit="vol">XXIV</biblScope>
      <pubPlace>Reykjavík</pubPlace>
      <date when="1966">1966</date>
    </imprint>
  </monogr>
</biblStruct>
Again, using the unique value of
@xml:id, a
<bibl> element anywhere in an XML file (that has access to the detailed record)
can refer to that information using
<ref> as follows:
<bibl><ref target="#StudIsl24">Ólafur Halldórsson 1966</ref>, pp. 18 and 22</bibl>
14.4 The encoding description
The <encodingDesc> should document the relationship between the electronic edition and the source it is based upon.
It is an optional part of the header, but we recommend that it contains information on the standard of encoding and level of quality.
It should have two sub-elements: a
<projectDesc> and an
<editorialDecl>.
The <projectDesc> can be used to specify in prose the standard of the encoding, e.g.
“This text has been encoded according to the standard set out in
The Menota handbook, version 3.0, at http://www.menota.org/guidelines”.
The <editorialDecl> uses the
<correction> element with the
@status attribute to specify the level of quality control.
Attribute values (according to TEI) are “high”, “medium”, “low”, “unknown”.
The TEI P5 Guidelines (ch. 2.3.3 “The Editorial Practices Declaration”)
have these definitions for the possible values:
- high: the text has been thoroughly checked and proofread
- medium: the text has been checked at least once
- low: the text has not been checked
- unknown: the correction status of the text is unknown
If the @status attribute is given a value, the
<correction> element may be empty.
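As a minimal sketch, a header that records only the level of quality control, without any prose, could contain the following (the value “medium” is chosen purely for illustration):

```xml
<editorialDecl>
  <!-- empty element: the @status value alone states the level of quality control -->
  <correction status="medium"/>
</editorialDecl>
```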
However, if desired, further specification can be given in prose within a
<p> element.
Next within the
<editorialDecl> element, a
<normalization> element with a Menota-specific
@me:level attribute
is used to specify the level on which the text is encoded. The prototypical levels are “facs”, “dipl” and “norm”,
but other levels can also be used in the transcription and should thus be specified, e.g. a “pal” level.
(See ch. 3.2 and ch. 3.4 for a discussion of these levels.)
Also here, a description in prose may be added in a
<p> element.
Note that more than one level may be specified, simply by separating the values with whitespace:
<editorialDecl>
  <normalization me:level="facs dipl norm">
    <p>This text has been encoded on three levels: facsimile, diplomatic and normalised.</p>
  </normalization>
</editorialDecl>
Finally within the
<editorialDecl> element, an
<interpretation>
element is used to specify the amount of lexical and grammatical information in the encoded text.
We suggest two attributes,
@me:lemmatized and
@me:morphAnalyzed, both with the values “completely”,
“partly” and “none”. A lemmatised text will have lemmata (i.e. dictionary entries) added in the
@lemma attribute of the
<w> element, while a morphologically analysed text will have grammatical forms specified in the
@me:msa attribute of the same element.
See ch. 4.3 for a general overview and ch. 9
for details on this lexical and morphological encoding. A description in prose may be added in a
<p> element.
An <encodingDesc> may look like this:
<encodingDesc>
  <projectDesc>
    <p>This text has been encoded according to the standard set out in
      <title>The Menota handbook</title>, version 3.0, at http://www.menota.org/guidelines.</p>
  </projectDesc>
  <editorialDecl>
    <correction status="high">
      <p>This text was proofread by Magnus Rindal and colleagues before the publication
        of the printed version in 1981. It is unlikely that it contains any significant
        number of errors. However, it can not be ruled out that the subsequent conversion
        of the file may have introduced some systemic errors.</p>
    </correction>
    <normalization me:level="dipl">
      <p>This text has been encoded on a diplomatic level, according to the editorial
        practice by Norsk Historisk Kjeldeskrift-Institutt.</p>
    </normalization>
    <interpretation me:lemmatized="completely" me:morphAnalyzed="completely">
      <p>The complete text has been lemmatised and morphologically analysed according
        to the rules specified in ch. 9 of the Menota Handbook, v. 3.0.</p>
    </interpretation>
  </editorialDecl>
</encodingDesc>
14.5 The profile description
The <profileDesc> is an optional part of the header. We recommend that
the language(s) used in the source are listed here within the element
<langUsage>. The
<langUsage> contains one or more
<language> elements with an
@ident attribute each. The value of
@ident
should be a three-letter code, where possible based on the international standard ISO 639-2.
ISO 639-2 contains a list of three-letter abbreviations of language names. In addition to codes for the modern languages, such as “dan” (Danish), “ice” or “isl” (Icelandic), “nor” (Norwegian) and “swe” (Swedish), it lists languages like Latin (“lat”) and Ancient Greek (“grc”). For Medieval Nordic, there is only one abbreviation: “non” for Old Norse, i.e. Old Icelandic and/or Old Norwegian. Since Old Norse is a problematic term and the abbreviation “non” is idiosyncratic, we recommend the values “oda” (Old Danish), “oic” (Old Icelandic), “onw” (Old Norwegian) and “osw” (Old Swedish). In cases of uncertainty, a hyphen may be used, e.g. “oic-onw” for a manuscript which is either Old Icelandic or Old Norwegian (but most probably Old Icelandic), “onw-oic” the other way round, etc. Please note that this usage is not ISO conformant.
The <profileDesc> may also be used to identify and give more detailed information on, for instance, persons or scribal hands
that are referred to elsewhere in the document.
In ch. 14.3.6 we have already encountered an example of how this could be useful for
referring to a person, using
<listPerson>.
Similarly, different hands in the source can be given IDs, so they may be referred to elsewhere in the document, most notably in the transcription to mark a change of hands.
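A change of hands could be marked along the following lines. This is only a sketch: it assumes the TEI <handShift/> element with its @new attribute, which is not discussed in this chapter, and the descriptions inside the <handNote> elements are invented for illustration.

```xml
<!-- In the header: each hand receives a unique ID -->
<handNotes>
  <handNote xml:id="h1">Main hand (hypothetical description)</handNote>
  <handNote xml:id="h2">Second hand (hypothetical description)</handNote>
</handNotes>

<!-- In the transcription: hand h2 takes over at this point -->
<p>some text here... <handShift new="#h2"/> more text here...</p>
```

Since @new is a pointer, the ID is prefixed with “#” when it is referred to.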
A <profileDesc> may look like this:
<profileDesc>
  <langUsage>
    <language ident="oic">Old Icelandic</language>
    <language ident="onw">Old Norwegian</language>
    <language ident="osw">Old Swedish</language>
    <language ident="oda">Old Danish</language>
    <language ident="oic-onw">Old Icelandic with Old Norwegian traits</language>
    <language ident="onw-oic">Old Norwegian with Old Icelandic traits</language>
    <language ident="lat">Latin</language>
    <language ident="grc">Ancient Greek</language>
  </langUsage>
  <handNotes>
    <handNote xml:id="h1"/>
    <handNote xml:id="h2"/>
  </handNotes>
</profileDesc>
14.6 The revision description
Even if this is an optional part of the header, it is essential that all changes to the file are recorded.
Each change is described within a separate
<change> element. Within it, the
<date> is given first, then the
<name> of the revisor (preferably with affiliation),
and, finally, a description in prose of the actual change.
A short series of
<change> elements may look like this:
<revisionDesc>
  <change>
    <date>2017-07-19</date>
    <name>
      <persName>Odd Einar Haugen</persName>
      <orgName type="affiliation">University of Bergen</orgName>
    </name>: Revised the encoding in accordance with v. 3.0 of the Menota handbook.
  </change>
  <change>
    <date>2006-04-18</date>
    <name>
      <persName>Tone Merete Bruvik</persName>
      <orgName type="affiliation">Aksis</orgName>
    </name>: Revised the transcription in accordance with v. 2.0 of the Menota handbook.
  </change>
</revisionDesc>
14.7 Example headers
Examples of Menota headers can be accessed in Appendix E. There are examples of both minimal and longer, more detailed headers.
The first two examples are for minimal headers: One header is for a single-text source, here Holm perg 6 fol (Barlaams ok Josaphats saga), while the other is for a multi-text source, here AM 242 fol (Codex Wormianus). Please note that these two examples show rather minimalistic headers with little information on, for instance, the manuscript source. The third example is more detailed and includes information on material features of the manuscript and its provenance.