Appendix I: Menota Light

Version 3.0 (12 December 2019)

by Odd Einar Haugen and Nina Stensaker

I.1 Introduction

This handbook is complex and deals with a number of questions that may not arise in the practical work for many transcribers. So, as stated in the very first chapter of the handbook, a text encoded according to the Menota Guidelines need not be complicated at all. It may do with just a handful of elements and attributes.

In this appendix, we are offering an example of a single-level transcription of a small extract from NRA 58 C, an early Norwegian fragment of Konungs skuggsjá. The full encoding, using all three levels of representation can be found in the text archive under NRA 58 C, and the underlying XML file can be downloaded there. In the present example, the transcription is on a single, diplomatic level, and there is no morphological annotation. However, the way Menota organises its files, further levels can easily be added later on, and the same goes for the morphological annotation.

I.2 The source

NRA 58 C is a fragment of four folios written in two columns, having a total of 2,981 words. We have selected the first 18 lines of folio 3r, column B, as our example. This extract contains 151 words, so it is in a sense a fragment of a fragment.

Ill. I.1. A fragment of Konungs skuggsjá. NRA 58 C, fol. 3r, col. B, l. 1–18.

I.3 The encoding

Below is the encoding of the text. In order to make the file easier to read, each word is set on a line of its own. The header is not included, but you may refer to ch. 14 for examples of minimal as well as more extensive ones.


<text xml:lang="onw">
  <body>
    <div>
      <p>
        <cb ed="ms" n="B"/>
        <lb ed="ms" n="1"/>
        <w>En</w>
        <w>ef</w>
        <w>þat</w>
        <w>lægi</w>
        <w>ner</w>
        <w>auðrum</w>
        <w>londum</w>
        <w>þa</w>
        <lb ed="ms" n="2"/>
        <w>mundi</w>
        <w>þ<ex>a</ex>t</w>
        <w>vera</w>
        <w>kallat</w>
        <w>þriðiungr</w>
        <w>af</w>
        <w>einum</w>
        <lb ed="ms" n="3"/> 
        <w>biscupsdome</w>
        <pc>.</pc>
        <w>en</w>
        <w>þo</w>
        <w>hafa</w>
        <w>þeir</w>
        <w>ser</w>
        <w>nu</w>
        <w>b<ex>is</ex>cup</w>
        <lb ed="ms" n="4"/>
        <w>þvi at</w>
        <w>eigi</w>
        <w>lyðir</w>
        <w>annat</w>
        <w>firir</w>
        <w>sva</w>
        <w>mikillar</w>
        <lb ed="ms" n="5"/>
        <w>fiarvistar</w>
        <w>sacker</w>
        <w>er</w>
        <w>þeir</w>
        <w>ero</w>
        <w>við</w>
        <w>aðra</w>
        <w>me<ex>n</ex></w>
        <lb ed="ms" n="6"/>
        <w>En</w>
        <w>þar</w>
        <w>er</w>
        <w>þu</w>
        <w>leitar</w>
        <w>eiptir</w>
        <w>þvi</w>
        <w>við</w>
        <w>hvart</w>
        <lb ed="ms" n="7"/>
        <w>er</w>
        <w>þeir</w>
        <w>lifa</w>
        <w>a</w>
        <w>þvi</w>
        <w>lande</w>
        <pc>.</pc>
        <w>með</w>
        <w>þvi</w>
        <w>at</w>
        <w>þar</w>
        <lb ed="ms" n="8"/>
        <w>er</w>
        <w>eicki</w>
        <w>sað</w>
        <w>a</w>
        <pc>.</pc>
        <w>oc</w>
        <w>lifa</w>
        <w>þo</w>
        <w>menn</w>
        <w>a</w>
        <w>þeim</w>
        <lb ed="ms" n="9"/>
        <w>loundum</w>
        <pc>.</pc>
        <w>þvi at</w>
        <w>við</w>
        <w>flera</w>
        <w>lifa</w>
        <w>menn</w>
        <w>en</w>
        <lb ed="ms" n="10"/>
        <w>við</w>
        <w>brouð</w>
        <w>eit</w>
        <pc>.</pc>
        <w>En</w>
        <w>a</w>
        <w>þesso</w>
        <w>lande</w>
        <w>er</w>
        <w>nu</w>
        <w>røð<lb ed="ms" n="11"/>dom</w>
        <w>vit</w>
        <w>um</w>
        <pc>.</pc>
        <w>þa</w>
        <w>er</w>
        <w>sva</w>
        <w>sact</w>
        <w>ifra</w>
        <pc>.</pc>
        <w>at</w>
        <w>þar</w>
        <w>se</w>
        <lb ed="ms" n="12"/>
        <w>gros</w>
        <w>goð</w>
        <pc>.</pc>
        <w><ex>oc</ex></w>
        <w>ero</w>
        <w>þar</w>
        <w>bu</w>
        <w>goð</w>
        <w>oc</w>
        <w>stor</w>
        <pc>.</pc>
        <w>þvi at</w>
        <w>me<ex>n</ex></w>
        <lb ed="ms" n="13"/>
        <w>hafa</w>
        <w>þar</w>
        <w>nouta</w>
        <w>mart</w>
        <w>oc</w>
        <w>souða</w>
        <pc>.</pc>
        <w>oc</w>
        <w>er</w>
        <w>þar</w>
        <lb ed="ms" n="14"/>
        <w>smiorgærð</w>
        <w>mikill</w>
        <pc>.</pc>
        <w>oc</w>
        <w>osta</w>
        <pc>.</pc>
        <w>oc</w>
        <w>lifa</w>
        <w>menn</w>
        <w>við</w>
        <lb ed="ms" n="15"/>
        <w>þat</w>
        <w>mioc</w>
        <w>oc</w>
        <w>sva</w>
        <w>við</w>
        <w>kiot</w>
        <pc>.</pc>
        <w>oc</w>
        <w>við</w>
        <w>alzkonar</w>
        <lb ed="ms" n="16"/>
        <w>veiði</w>
        <pc>.</pc>
        <w>bæðe</w>
        <w>við</w>
        <w>reina</w>
        <w>hold</w>
        <w>oc</w>
        <w>hvala</w>
        <w>oc</w>
        <w>sela</w>
        <lb ed="ms" n="17"/>
        <w>oc</w>
        <w>biarnar</w>
        <w>hold</w>
        <pc>.</pc>
        <w>oc</w>
        <w>føðaz</w>
        <w>menn</w>
        <w>við</w>
        <w>þ<ex>a</ex>t</w>
        <w>þar</w>
        <lb ed="ms" n="18"/>
        <w>a</w>
        <w>lande</w>
      </p>
    </div>
  </body>
</text>

I.4 Comments

As can be seen in the example, the whole text is embedded in the <text> element. The @xml:lang attribute which has the ‘onw’ value states that the language is Old Norwegian. Next, the text is embedded in the <body> element, which allows the encoder to add a <front> and <back> section to the transcription. Next, a single <div> element embeds the example – usually, a <div> will contain longer passages than this, such as chapters. Finally, there is a <p> element, stating that the text is in prose, as opposed to e.g. <lg> for metrical text. These elements were introduced and explained in ch. 3 above.

Note that the <pb/> and <cb/> elements precede the text. They have the @ed attribute with the ‘ms’ value, stating that the page and column beginnings refer to the source itself, and not, for example, an edition. It also has the @n attribute specifying the line number. Note that each <lb/> element precedes the line to which it refers. When a word is written over two lines, such as in lines 10–11, the <lb/> element will be placed at the break within the word itself. See ch. 3.12 and ch. 5.5 above for details.

The final two elements are the domineering ones: <w> for each word and <pc> for each punctuation character. As explained in ch. 3.6 above, the reason for embedding words in the <w> element is primarily to make it ready for morphological annotation. It is also a convenient way of dealing with the distinction between graphical and lexical words, as shown in ch. 5.3.2 above.

Then, there is only one element left, the <ex> element for the expansion of abbreviations. There are not many examples here, but one is found in line 2, where “þat” has been abbreviated with a bar across the ascender of the “þ”. If the transcription had been on the facsimile level, this abbreviation mark would have been encoded with the <am> element, leaving it to the diplomatic level to expand the mark. You may download the multi-level transcription of NRA 58 C from the Menota archive to see how this works.

I.5 File for downloading

We offer the XML file used in this appendix for downloading. This is the complete file, including the header:

NRA 58 C: a fragment of a fragment

Please note that some browsers may try and interpret and open this sample file. In order to download the file to your disk, use alt-click (Mac) or right-click (Windows) on your browser, unless your browser has other preferences.