Chapter 10. Manuscript description

Version 1.0 (20 May 2003)


10.1 Introduction
10.2 The manuscript description tagset
10.3 The manuscript identifier and manuscript heading elements
10.4 Intellectual content
10.5 Codicological features
10.6 The history of the manuscript
10.7 Other information
10.8 Names of persons, places and institutions, and bibliographical references


10.1 Introduction

This chapter deals with the description, rather than the transcription, of manuscripts and other primary source materials using TEI-compatible XML, more specifically the tagset developed by the MASTER project and the TEI workgroup on manuscript description. Although this tagset was designed to work within the larger TEI encoding scheme, and makes use of many of the standard TEI elements that are treated elsewhere in the present handbook, a number of the elements described in this chapter are new and have at the time of writing not yet been formally integrated with the rest of the TEI structure. Their use therefore requires the definition of a TEI extension set comprising two files, msDesc.ent and msDesc.dtd, the former containing definitions of parameter entities and the latter the actual element and attribute definitions which make up the required modifications. These files, available here, should be placed in the same directory as the other TEI DTD files, and the document type declaration subset of the individual documents should include the following declarations:

<!ENTITY % TEI.extensions.ent SYSTEM "msDesc.ent">
<!ENTITY % TEI.extensions.dtd SYSTEM "msDesc.dtd">

Until such time as these elements are formally adopted by the TEI, readers are referred to the MASTER reference manual, available online, for a full list of the elements available for use in manuscript descriptions.
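
The two declarations above belong in the internal subset of each document’s DOCTYPE declaration. The following is only a sketch, assuming the TEI P4 XML DTDs with tei2.dtd as the main DTD file and the prose base tagset; a real project would probably enable further tagsets, and the file paths must match the local installation:

```xml
<!DOCTYPE TEI.2 PUBLIC "-//TEI P4//DTD Main Document Type//EN" "tei2.dtd" [
  <!-- Assumed: XML version of the DTDs and the prose base tagset -->
  <!ENTITY % TEI.XML "INCLUDE">
  <!ENTITY % TEI.prose "INCLUDE">
  <!-- The manuscript description extension files, as given above -->
  <!ENTITY % TEI.extensions.ent SYSTEM "msDesc.ent">
  <!ENTITY % TEI.extensions.dtd SYSTEM "msDesc.dtd">
]>
```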


10.2 The manuscript description tagset

The <msDescription> element is the framing element into which the manuscript description is put. This element can appear within <sourceDesc> in the document header, providing information (so-called metadata, or data on data) on the source of which the body of the document is a transcription or excerpt (or a collection of digital images). This need consist of no more than the basic information necessary to identify the source, i.e. its location, both geographical and institutional, and its shelfmark or other identifying number or name (e.g. Oslo, Universitetsbibliotek, UB 1042 8vo), but it is also possible to provide a detailed description of the source, analogous to what one would find in the introduction to a scholarly edition. Alternatively, the body of the document can consist of nothing but <msDescription> elements, as in a traditional manuscript catalogue. Here again, the descriptions, or catalogue entries, can be simple, comprising no more than a few lines, or rich in detail and highly structured, spanning dozens or even hundreds of pages. In both cases, whether the <msDescription> element is in the header or body of the document, it is up to the editor, cataloguer or scholar to decide how much detail he or she wishes to include: the overall structure of the source description is essentially the same. It is worth pointing out that the same <msDescription> can easily serve both functions, either prefacing a transcription of the manuscript or serving as a single entry in a catalogue. An <msDescription> can also appear anywhere within a TEI-conformant document that a paragraph (<p>) can, which means a structured source description could be embedded within ordinary prose, for example in a scholarly monograph.
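
As an illustration of the first case, a header carrying only the minimal identifying information might be sketched as follows. The shelfmark is the one cited above; the title and publication statement are invented placeholders, and the surrounding header structure is assumed to be that of a standard TEI header:

```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <!-- Placeholder title, not taken from any real edition -->
      <title>Transcription of UB 1042 8vo</title>
    </titleStmt>
    <publicationStmt>
      <p>Placeholder: unpublished working file.</p>
    </publicationStmt>
    <sourceDesc>
      <msDescription>
        <msIdentifier>
          <settlement>Oslo</settlement>
          <repository>Universitetsbibliotek</repository>
          <idno>UB 1042 8vo</idno>
        </msIdentifier>
      </msDescription>
    </sourceDesc>
  </fileDesc>
</teiHeader>
```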

Within <msDescription> the following seven elements are available. Of these, only the first is mandatory:

<msIdentifier>: groups information that uniquely identifies the manuscript, i.e. its location, holding institution and shelfmark.

<msHeading>: contains a brief structured description of the manuscript, including a uniform or supplied title, information on place and date of origin, and the language or languages of the contents.

<msContents>: contains an itemised list of the intellectual content of the manuscript or manuscript part, with transcriptions of rubrics, incipits, explicits etc, as well as primary bibliographic references.

<physDesc>: groups information concerning all physical aspects of the manuscript or manuscript part, its material, size, format, script, decoration, binding, marginalia etc.

<history>: provides information on the history of the manuscript or manuscript part, its origin, provenance and acquisition by its holding institution.

<additional>: groups other information about the manuscript, in particular, administrative information relating to its availability, custodial history, additional materials associated with it, surrogates etc.

<msPart>: contains in essence a nested <msDescription>, in cases of composite manuscripts now regarded as constituting a single unit but made up of two or more parts which were originally physically distinct; since the contents, physical description and history of the individual parts will normally be quite different, an <msPart> element can contain all the elements listed above, with the exception of <msIdentifier> and <msHeading>.

Within each of these elements a number of sub-elements is available; <msContents>, for example, will normally consist of one or more <msItem> elements, each in turn containing specific elements for <rubric>, <incipit>, <explicit> and <colophon>, as well as the standard TEI elements <author>, <title> and <bibl> for bibliographic references. The contents need not be this structured, however, since with all the elements listed above, apart from <msIdentifier> and <msHeading>, there is also the option of using ordinary prose, marked up with the <p> element. Doing so would limit greatly the possibilities both for processing and searching the data, but could be preferable when dealing with pre-existing descriptions (so-called ‘legacy data’), the exact form of which one may wish, or be required, to maintain.
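
The overall shape of a description that mixes structured elements with plain prose could be sketched as follows. All text content here is placeholder material rather than a real catalogue entry, and the order of the elements simply follows the list above:

```xml
<msDescription>
  <msIdentifier>
    <settlement>Oslo</settlement>
    <repository>Universitetsbibliotek</repository>
    <idno>UB 1042 8vo</idno>
  </msIdentifier>
  <msHeading>
    <title>Placeholder: uniform or supplied title</title>
  </msHeading>
  <msContents>
    <!-- Structured: one msItem per text in the manuscript -->
    <msItem>
      <title>Placeholder: title of the first item</title>
    </msItem>
  </msContents>
  <physDesc>
    <!-- Unstructured: ordinary prose, e.g. taken over from legacy data -->
    <p>Placeholder: physical description in running prose.</p>
  </physDesc>
  <history>
    <p>Placeholder: history of the manuscript in running prose.</p>
  </history>
  <additional>
    <listBibl>
      <bibl>Placeholder: a study of the manuscript as a whole.</bibl>
    </listBibl>
  </additional>
  <!-- For a composite manuscript, one or more msPart elements would follow -->
</msDescription>
```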


10.3 The manuscript identifier and manuscript heading elements

The only mandatory element, as was said above, is <msIdentifier>. A number of sub-elements are available within <msIdentifier>: <country>, <region>, <settlement> (the TEI term for what most people would call a city), <institution>, <repository>, <collection> and <idno>, all of which are self-explanatory. Of these, only <settlement>, <repository> and <idno> are required, since they provide what is, by common consent, the minimum amount of information necessary to identify a manuscript. In many cases, no other elements are needed, as common sense will suffice to distinguish, say, Paris, France from Paris, Texas, as the location of the Bibliothèque Nationale. For search purposes, however, it is a good idea to include as much information as possible, such as <country> and, where applicable, <region>. There is one more optional element, <altName>, which can be used for a former shelfmark or some name other than the standard shelfmark by which a manuscript is known; the manuscript Uppsala, Universitätsbibliothek, DG 1, for example, is far better known under the name Codex Argenteus or the ‘Silver Bible’. There are many examples of nicknames among the manuscripts in the Arnamagnæan Collection, as Árni Magnússon frequently gave his manuscripts names based on the places where they had been made or where he got them from, or the people whom he knew to have produced or possessed them. Occasionally a manuscript can have several such names, or perhaps rather several forms of the name, typically in different languages. These can be dealt with through the lang attribute, which is available on all TEI elements.

A typical <msIdentifier> for a manuscript in the Arnamagnæan collection looks like this:

<msIdentifier>
  <country reg="DK">Danmark</country>
  <settlement>København</settlement>
  <repository>Det Arnamagnæanske Institut</repository>
  <idno>AM 45 fol.</idno>
  <altName type="nickname" lang="LAT">Codex Frisianus</altName>
  <altName type="nickname" lang="ISL">Fríssbók</altName>
</msIdentifier>

The value of the reg attribute on <country>, which is the standard international two-letter code, is for search purposes: one could find all manuscripts in Danish repositories regardless of whether the cataloguer has given the name of the country as ‘Denmark’ or ‘Danmark’ (or, for that matter, ‘Dänemark’, ‘Dinamarca’ or ‘Дания’). There are many such attributes in the manuscript description tagset which allow for cross-language searches.

The <msHeading> element is intended to provide a short summary description of the manuscript, which, if used in tandem with <msIdentifier>, may be said to constitute a minimal or first-level description, whether the <msDescription> is to be used in the header or in the body of the document.

The elements available within <msHeading> are <author>, <title>, <origPlace>, <origDate>, <textLang> and <note>, which can be used to provide information on the manuscript which is of particular importance or interest but not covered by the other elements. The <note> element is repeatable, and can be given a type attribute, if it is thought necessary to distinguish between different kinds of notes (on the other hand, if one feels the need to distinguish between many different kinds of notes, it is probably preferable to use the specific tags described below). The following, the <msHeading> for the Arnamagnæan manuscript AM 1 e β I fol., is typical:

<msHeading>
  <title type="uniform" lang="ISL">Sögubrot af nokkrum fornkonungum í Dana ok Svía veldi</title>
  <origDate notBefore="1275" notAfter="1325">c. 1300</origDate>
  <textLang langKey="NON">Old Norse/Icelandic</textLang>
  <note>This manuscript and some fragments of <title>Knýtlinga saga</title> in AM 20b I fol. are presumed originally to have belonged together. Together these fragments constitute the work known as <title>Skjöldunga saga</title>.</note>
</msHeading>

Here, too, the attributes provide information for search purposes, such as the notBefore and notAfter attributes on <origDate>.

It would, as was said, be possible to stop here, as all the basic information on the source has been provided in a structured and searchable way. In many cases, however, one will wish to describe the source in greater detail.


10.4 Intellectual content

Although there is a <title> element in <msHeading>, it is used for a uniform title (e.g. Brennu-Njáls saga) or a supplied title, which describes the contents of the manuscript as a whole (e.g. ‘Collection of rímur’). Detailed description of a manuscript’s contents is put in the <msContents> element, which consists of one or more <msItem> elements, prefaced, if desired, by an <overview> element, when only some of the items are to be described in detail.

<msItem> elements are allowed to ‘nest’, by which is meant that an <msItem> can contain other <msItem> elements; this is useful where separate items in a manuscript are grouped under a single title or rubric, for example in collections of prayers.

A defective attribute, with possible values of ‘yes’, ‘no’ or ‘unk’, is available on <msItem>, providing a useful means of distinguishing between texts which are fragmentary and those which are not. The attribute is also available on the specialised elements for <incipit> and <explicit>. When dealing with collections of fragments, each fragment may be given as a separate <msItem> and the first and last words of each transcribed as defective incipits and explicits, as in the following example, a manuscript containing four fragments of a single work:

<msContents>
  <msItem defective="yes"><locus from="1r" to="9v">1r-9v</locus>
  <title>Knýtlinga saga</title>
    <msItem n="1.1"><locus from="1r:1" to="2v:30">1r:1-2v:30</locus>
      <incipit defective="yes">dan<expan>n</expan>a a engl<expan>an</expan>di</incipit>
      <explicit defective="yes">en meðan har<expan>aldr</expan> hein hafði k<expan>onung</expan>r v<expan>er</expan>it yf<expan>ir</expan> danmork</explicit>
    </msItem>
    <!-- msItems 1.2 to 1.4 -->
  </msItem>
</msContents>

The standard TEI element <bibl> (and the grouping ‘parent’ element <listBibl>) is also available within <msItem>. This should be used to provide bibliographical information on the <msItem> level, i.e. concerning editions of the item in question. Bibliographical information pertaining to the manuscript as a whole can be placed in the <additional> element, described below.


10.5 Codicological features

The next major element in an <msDescription> is <physDesc>, i.e. physical description, within which the following specialised elements are available:

<form>: whether the ‘text object’ is a codex, roll, tablet etc.

<support>: whether the manuscript is written on parchment, paper etc., and a description thereof.

<extent>: the number and size of leaves.

<collation>: a description of the quire structure, any missing leaves and so on.

<layout>: the number of columns, dimensions of the written area, number of lines per page/column etc.

<bindingDesc>: a description of the present binding, if any, and information on any former bindings.

<foliation>: how and, if known, when and by whom the manuscript has been paginated/foliated.

<musicNotation>: a description of any musical notation.

<additions>: a description and/or transcription of any marginalia, glosses etc. in the manuscript.

<condition>: a description of the present physical state of the manuscript.

All these elements contain one or more paragraphs tagged with the standard TEI <p> element, which means that the content can range from a single word or phrase to the equivalent of several pages in a printed book.

There are also several elements available within <physDesc> for which one has the option of using a series of specific sub-elements. The <msWriting> element is intended for a description of the scribal hand or hands of the manuscript. This too may simply contain one or more <p> elements, but can also consist of a series of <handDesc> elements, each of which contains a prose description of one of the hands, marked up with <p>. The level of detail in these descriptions is determined entirely by the scholar or cataloguer. The following is an example of a short <msWriting> element:

<msWriting hands="1">
  <p>Written in <term type="script" reg="Hybrida">Gothic hybrid</term>. The scribe is unknown but the same hand is found on sections of AM 23 4to and Gl. kgl. S. 25 fol.</p>
</msWriting>

The use of the TEI element <term>, with attributes type and reg (‘regularised’), allows for more precise searching than would be possible with free text (but is obviously dependent on there being a commonly agreed taxonomy).

In the following, where a <handDesc> element is used to describe an individual hand (one of six in the manuscript), the script attribute is used to indicate the type of script. Note also the scope attribute, with possible values of ‘major’, ‘minor’ and ‘sole’.

<msWriting hands="6">
  <handDesc script="Hybrida" scope="major"><p>The main hand (Hand 1) writes <locus>ff. 1r-9r and 16r-118v</locus> in a practised Gothic hybrid.</p></handDesc>
  <!-- more handDesc elements -->
</msWriting>

Here the <locus> element, which we saw above in <msItem>, is used to indicate specifically which parts of a manuscript are written in a given hand.

As the content of the <handDesc> element is ‘p+’, i.e. one or more paragraphs, there is no limit to the amount of information which may be given on any single hand. Thus, a detailed analysis of palaeographical and orthographical features (‘/a/ is of the two-storey kind’, etc.) is perfectly possible within this overall structure.

There is a corresponding element, <decoration>, for the description of illumination and other decorative features in the manuscript. <decoration>, like <msWriting>, may simply contain one or more paragraphs, or a sequence of topically organised sub-elements, called <decoNote>s, each describing either a decorative component of a manuscript (e.g. a single illuminated initial) or a homogeneous class of such components (e.g. illuminated initials generally).

A large number of attributes is available on <decoNote>, including type (e.g. ‘initial’), subtype (e.g. ‘historiated’) and technique (e.g. ‘pen and wash’); there are also the attributes figurative, with possible values ‘yes’, ‘no’ or ‘unknown’, and illustrative, with the corresponding values ‘y’, ‘n’ or ‘u’. All these may be used in order to facilitate sophisticated searches.

The following is an example of a typical <decoNote>:

<decoNote type="secondary" subtype="initial" figurative="no" illustrative="n">
  <p>There are red initials on ff. 4r, 5v, 8r, 91r, 95r, 100r, 101r, 102r, 104r, 107r, 108r, 110r, 111r, 112r, 113 and 116r.</p>
</decoNote>

The standard TEI <list> element can also be used if one wishes to list separately the individual instances of a particular type of decoration, rather than using separate <decoNote> elements:

<decoNote type="miniature" technique="fully coloured" figurative="yes" illustrative="y">
  <p>The manuscript is decorated with 48 framed miniatures depicting scenes from the life of Christ and the life of the Virgin.
    <list>
      <item n="1"><locus>2v</locus><term>Pietà</term>; the dead Christ supported by the Virgin Mary.</item>
      <!-- other items -->
    </list>
  </p>
</decoNote>


10.6 The history of the manuscript

The <history> element contains information on the history of the manuscript. Available within it are just three sub-elements: <origin>, for information on when, where and, if known, by whom the manuscript was written, <provenance>, in which any evidence of ownership and use is provided, and <acquisition>, which describes when and how the manuscript was acquired by its holding institution. Each of these elements contains one or more paragraphs. Alternatively, as with the other major elements in a manuscript description, the <history> element may itself consist simply of one or more paragraphs in which the entire history of the manuscript is given (or, as the case may be, not given, if nothing is known of the manuscript’s previous history).
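
In its structured form, a <history> element could be laid out as in the following sketch. The paragraph contents are placeholders indicating what belongs in each sub-element, not information about any actual manuscript:

```xml
<history>
  <origin>
    <p>Placeholder: when, where and, if known, by whom the manuscript was written.</p>
  </origin>
  <provenance>
    <p>Placeholder: evidence of former ownership and use.</p>
  </provenance>
  <acquisition>
    <p>Placeholder: when and how the holding institution acquired the manuscript.</p>
  </acquisition>
</history>
```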

For manuscripts in the Arnamagnæan collection the principal source of information on the manuscript’s history will be Árni Magnússon himself, who frequently provided details on how he had come to possess the manuscript and anything he had been able to discover about its previous owners. This information is generally written on small paper slips which are kept with the manuscript, usually bound into the front or back, or separately in the manuscript AM 435 a 4to. One may wish to provide a full transcription of these comments within the <provenance> (or <acquisition>) element, as in the following example:

  <p>According to AM 435 a 4to, ff. 54v-56v, the manuscript had been owned by <name type="person" role="owner">Sr. Þórður Jónsson á <name type="place">Staðastað</name> (1672-1720)</name>, who had got it from <name type="person" role="owner">Jón Hákonarson að <name type="place">Vatnshorni</name> (c. 1658-1748)</name>, who had in turn got it from <name type="person" role="owner">Þorgeir Jónsson (c. 1661-1742)</name>, <foreign>ráðsmaður</foreign> at <name type="place">Hólar</name> and brother of Bishop <name type="person">Steinn Jónsson</name>. Þorgeir had got the manuscript, probably in 1696 or 97, at <name type="place">Kalastaðir</name>, <name type="place">Hvalfjarðarströnd</name> from <name type="person" role="owner">Þórður Illugason</name>, son of <name type="person">Illugi Vigfússon</name> (c. 1570-1634), son of <name type="person">Vigfús Jónsson, <foreign>sýslumaður</foreign> (d. c. 1595)</name>. Þorgeir's wife, <name type="person">Margrét Guðmundsdóttir</name>, and Þórður Illugason, who had no children of his own, were related (<foreign>þrímenningar</foreign>).</p>
  <p>The full text of Árni's comments reads:
    <q><p>Compendium Historiæ Norvegicæ, undiqve mutilum, alias fragmentum rarissimum. 4to minori. Komid til min fra Þordi Jonssyne. en<expan>n</expan> fyrer þ<expan>ad</expan> var þad i eigu Þorgeirs Jonssonar, sem þad feck...</p><pb/>
    <p>Fragmentum historiæ Norvegicæ in octavo /:þad sem eg feck af Þorde Jonssyne, en<expan>n</expan> han<expan>n</expan> af Jone Hakonarsyne/: eignadest Þorgeir Jonsson /:mägur Gudmundar Arnarsonar i Heynese/: ä Kalastødum ä Hvalfiardar strønd fyrer 10. eda 11. ärum (fra 1707. ad reikna) Þad hafdi næst f<expan>irir</expan> han<expan>n</expan> ätt Þordur Jllu<pb/>gason Vigfussonar, brodurson Orms i Eyum, og høfdu þesse blød vered langfedga eign þeirra fedga allt fra Vigfuse Jonssyne fordum Syslum<expan>anni</expan> i Kios, secundum traditionem þ<expan>ess</expan> folks.</p>
    <p>Þegar han<expan>n</expan> feck þesse blød, voru þau eins mutila &amp; nu eru þau. <del rend="overstrike">hefur</del> var &amp; þar sem Þorgeir þau feck, eck<expan>er</expan>t <pb/>meira, ecke helldr neinstadar þar um kring ä strøndinne, so vïtt Þorgeir inqvirerad gat, sem han<expan>n</expan> segest m<expan>ed</expan> flid giørt hafa.</p>
    <p>Eingar utskrifter ætlar Þorgeir þar af vera, ad vïsu seigest han<expan>n</expan> eckert slikt nockurn tïma sied hafa. Dixit coram 1707.</p>
    <p>Þorgeir atti eigi leinge þetta fragment, helldur feck þ<expan>ad</expan>, so mutilum sem <pb/>þad var, Jone Hakonar syne, en<expan>n</expan> h<expan>an</expan>n Þorde Jons syne sem adr er sagt.</p>
    <p>Jon Hakonar son af mi<expan>er</expan> adspurdr, meinar eingar utskrifter þar af vera i landinu, og seigest alldri þvilïkt neitt, fyrr edur sidar, sied hafa.</p></q></p>
  <p>This agrees with the information found on the second (of four) Arnamagnæan slip, which reads: <q>Eignarm<expan>en</expan>n þ<expan>ess</expan>a fragm<expan>en</expan>ts hafa nylegast vered <list><item>Þorgeir Jonsson.</item><item>Jon Hakonarson.</item><item>Þordr Jonsson.</item><item>Eg.</item></list></q></p>

It should be noted that the various mechanisms for the transcription of primary sources described elsewhere in this handbook, such as the expansion of abbreviations, may be employed here as well.


10.7 Other information

The final large grouping element in a manuscript description is, appropriately enough, the <additional> element. The first subsection of this element is called <adminInfo>, which, as its name suggests, contains information pertaining to the curation and management of the manuscript. Such information would not normally form part of the introduction to a scholarly edition, but there is no reason why it could not be included in the document header. Sub-elements available here include <custodialHist>, in which information can be given on such matters as conservation, loans and exhibitions, either as a series of paragraphs or as one or more dated <custEvent> elements, and the standard TEI element <availability>, for information on the availability of the manuscript, for example any restrictions on its use or access.
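
An <adminInfo> section along these lines might be sketched as follows. The events and dates are invented for illustration. The type, notBefore and notAfter attributes on <custEvent> are assumed here by analogy with <origDate>, as is the order of the sub-elements, so both should be checked against the MASTER reference manual:

```xml
<additional>
  <adminInfo>
    <availability status="restricted">
      <p>Placeholder: may be consulted only with the permission of the holding institution.</p>
    </availability>
    <custodialHist>
      <!-- Invented events; attribute names are assumptions -->
      <custEvent type="conservation" notBefore="2000-01-01" notAfter="2000-12-31">
        <p>Placeholder: conserved during 2000.</p>
      </custEvent>
      <custEvent type="exhibition" notBefore="2001-05-01" notAfter="2001-05-31">
        <p>Placeholder: exhibited in May 2001.</p>
      </custEvent>
    </custodialHist>
  </adminInfo>
</additional>
```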

Also available within <additional> is a <surrogates> element for information on photographic reproductions. Here it would be possible to provide information on, and links to, any digital reproductions which may be available of the manuscript.

Another element available within <additional> is <accMat>, for ‘accompanying material’, in which any additional material, not originally part of the manuscript but bound with it or otherwise accompanying it, can be described and/or transcribed. The Arnamagnæan slips, mentioned above, for example, might better be dealt with here, if the information they contain is not principally concerned with the manuscript’s history or provenance.

Finally, the element <listBibl> is available within <additional> for bibliographical information pertaining to the manuscript as a whole, rather than individual text-items, which, as was mentioned above, should rather be given under the appropriate <msItem>.


10.8 Names of persons, places and institutions, and bibliographical references

Most of the elements that have been mentioned so far have the character of boxes into which information of a certain type can be fitted. But it will be noted in the examples cited that there are other kinds of elements which can appear anywhere within the document, so-called ‘phrase-level elements’, of which there is a large number available within any TEI-conformant document. These are primarily used in order to facilitate certain types of processing and/or for search purposes. All names, for example, are tagged as such, using the <name> element, with a type attribute to indicate whether they are the names of persons, places or organisations (such as religious orders). More detailed information about persons can be provided in a <listPerson> element within the header’s <profileDesc>, using the standard TEI <person> element, to which the value of the key attribute refers. The individual <person> elements provide information on birth, death, residence and occupation, either as one or more paragraphs of running prose, or through the use of specialised sub-elements, and there are also attributes to indicate the gender and role of the person.

In the description of the provenance of AM 435 a 4to, cited above, instead of providing birth and death dates and so on for each of the persons mentioned, one could refer, using the key attribute on <name>, to an external <person> element, such as the following, for Þórður Jónsson:

<person id="ThorJon" sex="m" role="owner">
  <persName lang="ISL">Þórður Jónsson</persName>
  <birth notBefore="1672" notAfter="1672">1672</birth>
  <death notBefore="1720-08-21" notAfter="1720-08-21">21 August 1720</death>
      <settlement type="farm">Staðastaður</settlement>
      <region type="parish">Staðarsveit</region>
      <region type="county">Snæfellsnessýsla</region>
      <region type="compass">Western</region>
      <country reg="IS">Iceland</country>
</person>

Treating names in this way means that each person is uniquely identified with an ID, to which all individual instances of that person’s name then refer, whatever form those instances take. This solves the problem not only of variant spellings but also where, for example, a medieval author is known by a Latin name and any number of vernacular forms, many or all of which may have claims to ‘authenticity’. In order to ensure uniformity, the method generally employed in the library world has been to accept the form found in some authority file, for example that of the American Library of Congress, as the ‘base’ or ‘neutral’ form. Feelings can run high on this matter, however, and people are frequently reluctant to accept as ‘neutral’ an overtly ‘foreign’ form of the name of some local saint or hero. Within the <person> tag any number of variant forms of a name can be given, with no prioritisation, and hence less likelihood of offence. The chief advantage of treating persons in this way, however, is for searching, in particular once one has put together a large body of material. It is possible to search not only for persons with a particular name, but also for persons born in a particular place at a particular time. The <person> elements taken as a whole can also function as a reference tool, a veritable Who’s who in medieval and early-modern Scandinavia. The possibilities as regards scribes are especially exciting, as it would be a relatively easy matter to add images to the <person> elements showing the hand or hands of each scribe, making it possible eventually to produce a register of all known scribes, searchable in terms of date, location etc.
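
As a sketch of how such references might look in practice, the two passages below are adapted from the provenance example in section 10.6 and both point to the <person> element with the ID ‘ThorJon’ shown above. The key values are added here for illustration, and the first sentence is shortened:

```xml
<!-- Modern normalised form, in the cataloguer's prose -->
<p>The manuscript had been owned by
  <name type="person" role="owner" key="ThorJon">Sr. Þórður Jónsson á
  <name type="place">Staðastað</name></name>.</p>

<!-- Form found in Árni Magnússon's own note; same key, different spelling -->
<q>Komid til min fra <name type="person" key="ThorJon">Þordi Jonssyne</name>.</q>
```

A search on the key value would then find both passages, regardless of the spelling or inflection of the name.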

It is possible to treat bibliographical references in a similar way. Since many of the same works are likely to be referred to again and again it would seem most sensible to provide full bibliographical information only once, in a separate bibliography, to which all bibliographical references in the individual records could then point.

The following is a typical bibliographical record as found in the separate bibliography file:

<bibl id="StudIsl24">
  <author>Ólafur Halldórsson</author>
  <title level="m">Helgafellsbækur fornar</title>
  <title level="s">Studia Islandica</title>
  <biblScope type="vol">XXIV</biblScope>
</bibl>

In the description of AM 238 VII fol., one of the manuscripts discussed in this publication, the bibliographical reference is then given using a <ref> element within <bibl>, as follows:

<bibl><ref target="StudIsl24">Ólafur Halldórsson 1966</ref>, pp. 18 and 22</bibl>

As with the <listPerson> file, the bibliography file, which can in effect become an authorised bibliography of studies in medieval Scandinavian philology, can be searched and browsed separately, making it a valuable tool for scholars.




Preliminary version created 11 March 2002. Version 1.0 published 20 May 2003.