Ch. 10. Normalisation

10.1 Introduction
10.2 Normalisation of lemmata
10.3 Normalisation of texts
10.4 Samples

This is a preliminary version which can be changed or updated at any time.
The revision and updating of this chapter has been assigned to Haraldur Bernharðsson.
Additional text by Alex Speed Kjeldsen, Odd Einar Haugen, Ivar Berg and possibly others.


10.1 Introduction

This chapter discusses the normalisation of the orthography of (1) lemmata and (2) texts in Menota. A strict normalisation of the former is important for the consistency and searchability of the annotated texts in the corpus. This normalisation refers to the contents of the @lemma attribute described in ch. 9.2. The normalisation of texts cannot be specified as strictly, partly because there are no commonly accepted normalisation rules for Old Swedish and Old Danish and partly because there is some variation also within the rules for Old Icelandic and Old Norwegian (Old Norse). This type of normalisation refers to the regularisation of texts within the <me:norm> element discussed in ch. 3.2 above.


10.2 Normalisation of lemmata

10.3 Normalisation of texts

Medieval (Norse) orthography is not as systematic as modern orthography. Typically, the medieval orthography varies from one scribe to another, from one period to another, and, moreover, within the work of the same scribe. This orthographic variation can detract significantly from the readability of the text. In order to make the text more accessible to the readers, the editor may choose to remove this orthographic variation and present the text in a normalised orthography.

The normalisation of a medieval text is, however, not an easy undertaking. The modern editor (obviously) lacks native fluency in both the Old Norse language and the linguistic standard he is imposing on the text (even if the latter is a modern creation).

At the outset, the editor will have to make two practical decisions: (1) Select the norm, and (2) decide the scope of the normalisation.

When it comes to selecting a norm, there are mainly two choices for an edition based on a single manuscript:

(a) Normalisation based on internal criteria. In this kind of normalisation, the language of the manuscript itself would be used as a point of reference. A manuscript from the middle of the 14th century would thus retain (as far as possible) its mid-14th-century characteristics in a normalised edition. This can be a challenging task and not many editions have used normalisation of this sort.

(b) Normalisation based on external criteria. Typically, this requires imposing on the manuscript text an orthographic and linguistic norm from a different period. There are, for example, two main alternatives under this heading when editing Icelandic texts: (i) Classical Old Icelandic normalised orthography, based on the Icelandic language around or shortly after 1200 (on which see more below), and (ii) Modern Icelandic normalised orthography.

The fundamental aim of normalised orthography is to remove orthographic variation that detracts from the readability of the text. In practice, however, the normalisation usually affects different aspects of the language. The orthographic manifestation of sound changes is erased or inserted, morphological forms are altered and sometimes even the word order is changed. The outcome is, therefore, not only an edition with a normalised orthography, but rather with normalised language, and the editor must decide how extensive a normalisation is needed for his intended readership.


10.3.1 Normalisation of Old Norse texts

The orthography of Old Norse texts can be normalised in several ways. In this chapter, we will discuss four (or more?) existing norms: (1) the orthographic normalisation in 18th- and 19th-century editions, based on the contemporaneous Icelandic orthography, (2) the Íslenzk fornrit norm, (3) the Old Norwegian norm of Gammelnorsk Ordboksverk in Oslo, and (4) the norm of the Dictionary of Old Norse Prose in Copenhagen.

Fig. 10.1. Normalisation variation in Old Norse texts

This figure has been made by Robert Paulsen, and might be used (modified) somewhere in the present chapter. Note that Modern Icelandic forms should be replaced with examples from Fritzner’s dictionary.

Normalisation in 18th and 19th century editions

When medieval scribes copied manuscripts, they mostly followed their own orthography and also frequently modified the text in other ways, and this practice continued as long as manuscripts were being copied by hand. A conscious attempt to reproduce the original text letter by letter only arose in scholarly treatment of the old manuscripts during the early modern era (e.g. Árni Magnússon’s apographa, accurate copies of charters). The idea that one should employ a consistent orthography is connected to the emergence of printing. However, the first printed Old Norse texts in the late seventeenth century (by Olof Verelius and his successors in Sweden, Peder Hansen Resen in Denmark, and Þórður Þorláksson in Iceland) continued the manuscript tradition by printing the manuscript at hand rather uncritically and with contemporary Icelandic spelling (Finnur Jónsson 1918: 25–30).

The editions by Det kongelige nordiske Oldskriftselskab, founded 1825 by C.C. Rafn and Rasmus Rask, mark a new practice (the following is based on Berg 2014, where more examples and a more thorough discussion can be found). The editions normalised the spelling, although not very strictly; u and o are for instance both used in endings, and each volume also pays some attention to the manuscript on which it is based. Some orthographic principles were carried out, like “k” where the manuscript varied between “c” and “k”, and a consistent distinction between the vowel “i” and the semivowel “j”. The normalisation was based on Rask’s ideas, as they were presented in his grammars, and to a large degree based on modern Icelandic practice (and pronunciation). There are two main reasons for this: One was Rask’s idea of a shared orthography for Old and Modern Icelandic, which for instance made him use “â” for former /aː/ which had changed to /oː/, e.g. “vân” for ON ván, MI von. Another reason was a faulty understanding of diachronic change in Icelandic, and as historical linguistics made advances the orthography of Old Norse editions was changed to reflect the language of the “Golden age” in the High Middle Ages.

Konráð Gíslason was one of those who contributed to the improved understanding of Old Norse as different from Modern Icelandic, especially through his Um frumparta íslenzkrar túngu í fornöld (1846) and later in various articles. In the synthetic and fully normalised edition of Hrafnkels saga freysgoða (1839), which Konráð published together with P.G. Thorsen, it was explicitly stated that the orthography was supposed to represent the Stand der Forschung.

The first thorough grammar (with an accompanying reader) after Rask’s works was published by P.A. Munch and C.R. Unger in 1847. In some regards they followed Jakob Grimm more than Rask; nevertheless, whether e.g. “á” was said to represent /aː/ or /aw/ did not matter for the normalisation, which was in any case “á”. Munch and Unger among other things distinguished between “æ” and “œ”, and their ideas were followed in their own edition of Fagrskinna (1847) and their and Rudolf Keyser’s edition of Konungs skuggsjá (1848). The aim of the normalisation was the romantic idea of the language in its “best period” (cf. Wollin 2000). On the other hand, Barlaams ok Josaphats saga (1851) by Keyser and Unger was normalised according to the Old Norwegian language of the main manuscript (Holm perg 6 fol.), also where lacunae had to be filled from Icelandic manuscripts and one page even translated from Latin. Later Norwegian editions (mainly by Unger) usually followed the manuscript closely, and from Gustav Storm in the late nineteenth century an even stricter diplomatic tradition has dominated Norwegian editions (Haugen 1994: 154).

Building on the founding work of Rask and the further elaborations by especially Konráð Gíslason and the Danish linguist K.J. Lyngby as well as his own studies of the primary sources, Ludvig Wimmer published a new grammar in 1870 (revised Swedish edition 1874) with an accompanying reader; the reader was subsequently revised in several new editions and its normalisation proved to be very influential. Whereas the first edition followed the “usual” normalisation, Wimmer wrote an introduction to the second edition (1877) where he discussed the issue systematically. The aim for normalisation is the language in its classical state, identified as the thirteenth century, and the Icelandic Book of homilies is especially important, not least because of its consistent marking of vowel length, a notoriously difficult matter because of many subsequent changes. (The praise of the Book of homilies was echoed by Adolf Noreen in his grammar (1884).) To reach this classical linguistic state Wimmer replaced some younger forms in the first edition with older ones, e.g. preterite forms of reduplicating verbs like hljópu for younger hlupu and accusative plural of masculine u-stems like sonu for younger syni. The historical approach is evident: The norm is supposed to reflect the oldest known language.

Wimmer’s normalisation has largely been accepted, e.g. in the most widely used editions today, the Íslenzk fornrit series. However, Adolf Noreen criticised it as being unsuitable for a historical phonology, and preferred a more archaic orthography, for instance distinguishing á and . This represents a point where historical linguistics deviated too much from accustomed spellings to be accepted in text editions, and very few follow this norm. Another point is that following Noreen would remove Old Norse from Modern Icelandic. Even though Rask’s goal of a shared normalisation for the old and the modern language has been abandoned, the differences are not disturbingly big for the Icelandic readership. A point worth mentioning in this context is the use in most normalised editions of double consonants before inflectional endings, as in the preterite forms kenndi and byggði (inf. kenna and byggja) as in modern Icelandic. The editions e.g. in Altnordisches Saga-Bibliothek follow the Mainland Scandinavian practice of shortening the consonants in this position, cf. Norwegian kjende and bygde (inf. kjenna and byggja).

Many nineteenth century editions, including the first edition of Wimmer’s reader, point to the “usual” normalisation in text editions, and Berg (2014) claims that the normalisation is thus a result of practice and tradition more than conscious decisions. Nonetheless, the basis is often found in the “best” manuscripts from the thirteenth century, identified as the “classical” period.

Icelandic

As already indicated, there are mainly two alternatives when selecting an external standard for presenting a text in Icelandic:

(i) Classical Old Icelandic normalised orthography takes as its point of reference the Icelandic language around or shortly after 1200. This is the standard used by Ludvig F.A. Wimmer in his Oldnordisk læsebog (‘An Old Norse Reader’) in 1877 and has since then been widely used with some minor modifications, in for example the series Altnordisches Saga-Bibliothek and Íslenzk fornrit. This has also been used for the normalisation of lemmata by Ordbog for det nordiske prosasprog (ONP). The following are some of the characteristics of the classical Old Icelandic normalised orthography:

1. Orthographic distinction of the short vowels ǫ : ø

2. Orthographic distinction of the long vowels ǽ : ǿ

3. Orthographic distinction of vowels i : y, í : ý, and ei : ey

4. Etymological vá rendered “vá”: svá, hvárt, vápn

5. Etymological long monophthong é rendered “é”: mér, sér, þér

6. Etymological short monophthong e rendered “e” before ng: lengi

7. Word-final -t in unstressed position: at, þat, hvat

8. Word-final -k in unstressed position: ok, ek, mik

9. The middle voice exponent as -sk: kallask

10. Orthographic distinction of the endings -r and -ur (no epenthetic u): nom. sing. armr : nom.-acc. plur. sǫgur

The editors of the Íslenzk fornrit series have in some instances employed a slightly younger variant of this standard for 14th-century texts, by, for instance, adopting the vowel mergers ǫ + ø > ö and ǽ + ǿ > æ.

(ii) Modern Icelandic normalised orthography is often used in text editions in Iceland. This requires a fair amount of modernising. The following are some of the characteristics of the Modern Icelandic normalised orthography vis-à-vis the classical Old Icelandic orthography:

1. The merger of the short vowels ǫ + ø > ö

2. The merger of the long vowels ǽ + ǿ > æ

3. Orthographic distinction of vowels i : y, í : ý, and ei : ey

4. Etymological vá rendered “vo”: svo, hvort, vopn

5. Etymological long monophthong é rendered “é”: mér, sér, þér

6. Etymological short monophthong e rendered “e” before ng: lengi

7. Fricativisation of word-final -t in unstressed position: að, það, hvað

8. Fricativisation of word-final -k in unstressed position: og, eg, mig

9. The middle voice exponent as -st: kallast

10. No orthographic distinction of the endings -r and -ur (epenthetic u): nom. sing. armur : nom.-acc. plur. sögur
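The correspondences in the two lists above can be illustrated with a minimal Python sketch. It is not an editorial tool: the function name modernise_word, the middle_voice flag and the word table are hypothetical, the word table covers only the example words given in the lists, and the input is assumed to be a single lower-case word in the classical normalisation.

```python
import re
import unicodedata

# Items 1 and 2: the vowel mergers ǫ + ø > ö and ǽ + ǿ > æ.
VOWEL_MERGERS = str.maketrans({"ǫ": "ö", "ø": "ö", "ǽ": "æ", "ǿ": "æ"})

# Items 4, 7 and 8 are lexical, so they are handled by whole-word lookup.
# Only the example words from the lists above are included (illustrative).
WORD_FORMS = {
    "svá": "svo", "hvárt": "hvort", "vápn": "vopn",
    "at": "að", "þat": "það", "hvat": "hvað",
    "ok": "og", "ek": "eg", "mik": "mig",
}

# Consonants after which a word-final -r receives an epenthetic u (item 10).
# "r" itself is left out, so a final -rr is not touched.
CONSONANTS = "bdðfghjklmnpstvxzþ"


def modernise_word(word: str, middle_voice: bool = False) -> str:
    """Map one classical normalised word to a Modern Icelandic spelling.

    middle_voice must be set by the caller, because a final -sk is not
    always the middle voice exponent (item 9).
    """
    word = unicodedata.normalize("NFC", word)
    if word in WORD_FORMS:
        return WORD_FORMS[word]
    word = word.translate(VOWEL_MERGERS)
    if middle_voice and word.endswith("sk"):
        word = word[:-2] + "st"
    # Item 10: armr > armur, while sǫgur keeps its original -ur.
    word = re.sub(rf"(?<=[{CONSONANTS}])r$", "ur", word)
    return word


print([modernise_word(w) for w in ["armr", "sǫgur", "svá", "mér", "lengi"]])
print(modernise_word("kallask", middle_voice=True))
```

Items 3, 5 and 6 need no rule, since the two norms agree on them. The morphological changes mentioned in the text are outside the scope of a spelling-level sketch like this one.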

In addition, the Modern Icelandic normalisation sometimes incorporates morphological changes, especially in the endings of the verbs.

Old Norwegian

Gammelnorsk ordboksverk (The Old Norwegian Dictionary) was established in 1940 with the aim of publishing a dictionary of the Old Norwegian language. For this work, a fixed orthography was needed for the dictionary entries (lemmata), which should correspond to the then established norm for Old Icelandic, but also take into consideration the specific traits of Old Norwegian. The most recent version of these rules is available here in extenso:

GNO rules (June 1982)

The major differences between the GNO norm and the Íslenzk fornrit norm are the following (cf. pp. 8-9 of the rules):

1. The long vowels "á", "é", "í", "ó", "ú" and "ý" should be indicated by accents. However, the long "æ" should not be indicated by an accent [presumably because the short "æ" was not recognised in this orthography], and the long "ø" should be rendered by "œ" and the short "ø" by "ø". In charters (diplomer) dated after 1300, the long "ø" should be indicated by an accent, "ǿ".

2. The vowel "ǫ" (o med kvist) should be rendered by "o", i.e. there should not be any distinction between "o" and "ǫ".

3. The consonant symbols should be the ordinary ones. There should be only one symbol for each of the consonants "r", "s", "f" and "v" (not the round "r", the tall "s" nor the Insular forms of "f" and "v"). The combination "ck" should be spelt "kk".

4. The falling diphthongs should be spelt "ei", "au" and "øy".

5. The rising diphthongs should be spelt "ia", "io" (not with "j"), similarly "-ia" and "-iu" in word-final position ("skilia", "kirkiu").

6. The consonant "h" should be left out in front of "l", "n" and "r" ("lutr" m. etc.).

7. The privative prefix should be "ú" ("úreinn" adj.).

8. The unstressed vowels should be rendered as "i" (for "i" and "e") and "u" (for "u" and "o"), similarly for "-liga" and "-ligr". In other words, there should be no vowel harmony in the orthography.

9. Reflexive verbs should have the ending "-st" ("nálgast" etc.).

10. The consonant combination "ft"/"pt" should be rendered by "pt" ("eptir" prep.).

11. The consonant combination "fn"/"mn" should be rendered by "fn" ("sofna" verb). However, there may be exceptions to this rule, especially for words which are closely connected to another word of the same root, e.g. "samna" verb (cf. "samr" adj.).

12. There should not be any mutation (omlyd) in unstressed positions, e.g. "prédikarum" rather than "prédikurum", "kunnastu" rather than "kunnustu", etc. In editorial comments, however, mutated vowels might be used in this position, e.g. "kolluðum" rather than "kallaðum".
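Some of the rules above are mechanical enough to be sketched in code. The following Python fragment is only an illustration under stated assumptions: the function name to_gno and the middle_voice flag are hypothetical, the input is assumed to be a single lower-case word in an ONP-style normalisation (with "ǫ", "ǽ" and "ǿ"), and rule 5 is simplified to "j becomes i before a vowel". Rules 7, 8, 11 and 12 depend on lexical or morphological knowledge and are not attempted.

```python
import re

# Rules 1 and 2: no accent on long æ, long ø as œ, and ǫ merged with o.
# (The special treatment of charters after 1300 is ignored here.)
VOWEL_SYMBOLS = str.maketrans({"ǫ": "o", "ǽ": "æ", "ǿ": "œ"})

VOWELS = "aáeéiíoóuúyýæœø"


def to_gno(word: str, middle_voice: bool = False) -> str:
    """Apply a mechanical subset of the GNO rules to one normalised word."""
    word = word.translate(VOWEL_SYMBOLS)
    # Rule 5 (simplified assumption): "j" before a vowel is written "i".
    word = re.sub(rf"j(?=[{VOWELS}])", "i", word)
    # Rule 6: initial "h" is dropped before "l", "n" and "r".
    word = re.sub(r"^h(?=[lnr])", "", word)
    # Rule 10: "ft" is rendered "pt".
    word = word.replace("ft", "pt")
    # Rule 9: the reflexive ending is "-st"; the caller must flag the verb.
    if middle_voice and word.endswith("sk"):
        word = word[:-2] + "st"
    return word


for form in ["hǫfðu", "tveggja", "hlutr", "eftir", "bǿnum", "þjónustu"]:
    print(form, "->", to_gno(form))
```

The last example comes out as "þiónustu", not as the "þiónastu" of the GNO sample in 10.4.2, because the absence of mutation in unstressed position (rule 12) cannot be derived from the spelling alone.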


For a historical survey of Gammelnorsk ordboksverk, see Magnus Rindal, "Gammelnorsk ordboksverk 50 år, 1940-1990", Maal og Minne 1991, pp. 29-58.

The Dictionary of Old Norse Prose (ONP)

The ONP orthographic norm ....


10.3.2 Normalisation of Old Swedish text

10.3.3 Normalisation of Old Danish text

10.4 Samples

This section illustrates various types of normalisation of texts in the four major Nordic medieval languages.


10.4.1 Old Icelandic

10.4.2 Old Norwegian

The sample below is from the earliest preserved Norwegian codex, AM 619 4º, which has been dated to ca. 1200–1225. Only about 8 fragments from the period ca. 1150–1200 are older. In the two normalisations exemplified below, words that differ are highlighted in red.

Fig. 10.2. The opening of a homily on the Virgin Mary in the Old Norwegian Homily Book, Copenhagen, AM 619 4º, fol. 63r, l. 13–20.

Diplomatic transcription

En hælga maria mær moðer drotens várs var
ens bæzta kyns komen fra abraham ok ór kyni da-
uiðs konongs. Hinir nanæsto frændr hennar varo
ret láter ok hofðu mykit crafta lán af guði en litit au-
ra lan af hæimi. En þegar er maria kunni grein
góz ok íllz. þa lagðe hon þegar alla ꜵst við guð sva at
hon var ávalt í guðs þionasto annat tvæggia á bø
num. eða hon hugði at spamanna bocum. eða var í noc[coro]

Normalised text according to the GNO rules

En helga Maria mær móðir dróttins várs var
ins bezta kyns komin frá Abraham ok ór kyni Da-
viðs konungs. Hinir nánæstu frændr hennar váru
réttlátir ok hofðu mikit krafta lán af guði en lítit au-
ra lán af heimi. En þegar er Maria kunni grein
góz ok ills, þá lagði hon þegar alla ást við Guð, svá at
hon var ávalt í Guðs þiónastu annat tveggia á bœ-
num, eða hon hugði at spámanna bókum, eða var í nok[kuru]

Normalised text according to the ONP rules

En helga Maria mǽr móðir dróttins várs var
ins bezta kyns komin frá Abraham ok ór kyni Da-
viðs konungs. Hinir nánǽstu frǽndr hennar váru
réttlátir ok hǫfðu mikit krafta lán af guði en lítit au-
ra lán af heimi. En þegar er Maria kunni grein
góz ok ills, þá lagði hon þegar alla ást við Guð, svá at
hon var ávalt í Guðs þjónustu annat tveggja á bǿ-
num, eða hon hugði at spámanna bókum, eða var í nǫk[kuru]
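The words that differ between the two normalisations can also be found mechanically. The following Python sketch compares the third line of each normalised sample word by word; the function name differing_words and the two constants are hypothetical, and the approach assumes that both versions contain the same number of words in the same order.

```python
def differing_words(text_a: str, text_b: str) -> list[tuple[str, str]]:
    """Return the (a, b) word pairs at which two parallel texts differ."""
    words_a = text_a.split()
    words_b = text_b.split()
    if len(words_a) != len(words_b):
        raise ValueError("the two texts must have the same number of words")
    return [(a, b) for a, b in zip(words_a, words_b) if a != b]


# Third line of the sample in each normalisation, plus the first words
# of the fourth line.
GNO_LINE = "Hinir nánæstu frændr hennar váru réttlátir ok hofðu mikit"
ONP_LINE = "Hinir nánǽstu frǽndr hennar váru réttlátir ok hǫfðu mikit"

for gno_form, onp_form in differing_words(GNO_LINE, ONP_LINE):
    print(f"GNO {gno_form} : ONP {onp_form}")
```

For this excerpt the differing pairs are nánæstu : nánǽstu, frændr : frǽndr and hofðu : hǫfðu, i.e. the treatment of long "æ" and of "ǫ" described in the GNO rules.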


10.4.3 Old Swedish

10.4.4 Old Danish

