Chapter 6. Abbreviations

Version 3.0 (12 December 2019)

by Haraldur Bernharðsson and Odd Einar Haugen

6.1 Introduction

Abbreviations are a common feature of medieval orthography. In the medieval Nordic tradition, abbreviations were used most frequently in Norwegian and Icelandic manuscripts, and particularly in the latter. In some Icelandic manuscripts, as many as a third of the words may be abbreviated, some of them with several abbreviation marks. The system of abbreviations was inherited from English and Continental practice, but the adoption of this system also meant that the usage of some abbreviation marks was extended and it led to the development of some new types.

The encoding of abbreviations is discussed in the TEI P5 Guidelines, ch. 11, particularly ch. 11.3.1.2. TEI P5 offers on one level the <abbr> element for the encoding of any abbreviations and the <expan> element for the expansion of these abbreviations, and on the next level, the <am> element for the actual abbreviation markers (such as the superscript bar for a nasal) and the <ex> element for the expansion of these markers (in this case “n” or “m”). While the <abbr> and <expan> elements are fully compatible with the schemas used in this handbook, we suggest that it is sufficient to use only the elements <am> and <ex> for the encoding and expansion of abbreviations.

The <am> element may have a @me:type attribute specifying what kind of abbreviation it is. The same applies to the <ex> element. We have not given examples of these attributes in the present chapter, but users may refer to the typology in ch. 6.2 below if they would like to make a more detailed encoding.

Elements & attributes	Obl/Opt	Explanation
<am>		Contains the abbreviation sign used in the manuscript.
@me:type	Optional	Specifies the type of abbreviation.
<ex>		Contains the expansion inserted by the editor, replacing the abbreviation marker.
@me:type	Optional	Specifies the type of expansion.

In a multi-level transcription, the <am> element typically belongs to the facsimile level (<me:facs>), while the <ex> element belongs to the diplomatic level (<me:dipl>). The normalised level (<me:norm>) usually has none, e.g.


<w>
  <choice>
    <me:facs>han<am>&bar;</am></me:facs>
    <me:dipl>han<ex>n</ex></me:dipl>
    <me:norm>hann</me:norm>
  </choice>
</w>

In this chapter, we shall give a typology of abbreviations and then exemplify a number of cases.

6.2 Typology

6.2.1 Overview

Abbreviations are usually divided into four categories (see, e.g., Hreinn Benediktsson 1965, p. 85 and, for a more detailed classification, Kristian Kålund 1907, pp. viii–x):

(1) Suspensions. The first part of the word, often the initial letter only, is written out, followed by a dot or similar mark, e.g., “ſ.” = ſonr ‘son’. The plural may be represented by a doubling of the initial letter, e.g., “ſſ.” = ſynir ‘sons’.

(2) Contractions. Some letters are left out, but the initial and final letters are written out, often one or more of the intermediate as well. The abbreviation is often indicated with a horizontal bar above the word.

(3) Interlinear marks. The interlinear abbreviation is usually a vowel representing either r or v + the vowel itself or a consonant representing a + the consonant itself.

(4) Special signs (brevigraphs). These signs are usually placed on the base line and are thus akin to ordinary letters. The Tironian notae belong to this category.

The typology in ch. 6.3 below takes as its point of departure the location of the abbreviations. The main distinction is drawn between abbreviation signs placed on the base line and those placed above (or through or below) a base line character. From the point of view of character encoding, the former signs can be defined and treated like any other characters, while the latter signs must be defined as combining characters, like other diacritical marks.

6.2.2 Glyphs

Glyphs are displayed in the Andron font by Andreas Stötzner. The regular version of this font can be downloaded from the MUFI font page.

Since abbreviation marks typically appear as parts of words and are frequently associated with a base line character, we have chosen to illustrate each mark within the context of a whole word.

6.2.3 Entity names

All abbreviations are referred to with entity names, with the exception of full stop, “.”, and colon, “:”. Entity names are placed within the delimiters “&” and “;”, and we have tried to give as short and mnemonic names as possible. As a rule, we have based the entity name on the typical expansion of the abbreviation. Thus, the cross mark which is an abbreviation for kross is given the entity name &cross;.

We aim at synchronizing our use of entities with those recommended by ISO, but since there presently are no abbreviation entities in ISO, we are left to our own devices in this chapter.

6.2.4 Unicode values

The Unicode Standard 12.0 has only defined a handful of abbreviation characters and only a few of interest for our use. The great majority of abbreviation characters must therefore be assigned to codepoints in the Private Use Area. Among the few exceptions are the full stop, colon and semicolon, which are part of the range Basic Latin in Unicode, and the Tironian sign for et, in the range General Punctuation.

For a complete list of suggested Unicode values, see app. A below.
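
As a rough illustration of how entity names and code points fit together, the following sketch declares a few of the entities used in this chapter in an internal DTD subset. The code points are those given in the tables of this chapter; the root element name TEI and the layout of the declarations are assumptions made for the example, and the actual entity files may be organised differently.

```xml
<!-- Hypothetical internal DTD subset, for illustration only -->
<!DOCTYPE TEI [
  <!-- Base line sign, defined in Unicode (General Punctuation) -->
  <!ENTITY et      "&#x204A;">
  <!-- Base line sign, assigned to the Private Use Area -->
  <!ENTITY etslash "&#xF158;">
  <!-- Combining marks (Combining Diacritical Marks) -->
  <!ENTITY bar     "&#x0305;">
  <!ENTITY osup    "&#x0366;">
]>
```

With declarations of this kind in place, a transcription can use &et; or &bar; throughout, and the mapping to a code point is kept in one place.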

6.2.5 Descriptive names

As is the case with ordinary characters (cf. ch. 5) we adhere to the naming scheme in Unicode. Since The Unicode Standard 12.0 only defines one abbreviation mark in the Latin alphabet, the TIRONIAN SIGN ET in the range General Punctuation, and only one in each of the Armenian, Syriac, Devanagari, Thai and Khmer alphabets, we do not have completely clear examples of descriptive names. We suggest ABBREVIATION SIGN [+ APPROPRIATE DESCRIPTION] as a general name for abbreviations occupying a separate position on the base line, and COMBINING ABBREVIATION MARK [+ APPROPRIATE DESCRIPTION] for those typically placed above, through or below a base line character.

6.3 Abbreviation marks on the base line

Abbreviation marks on the base line behave as any other character. The typology of these abbreviation marks is discussed and exemplified below.

6.3.1 The “et” mark

The Tironian nota resembling the number “7” (or the character “z” with or without a crossbar) used for the conjunction et ‘and’ in Latin is frequently used to denote the corresponding conjunction ok ‘and’ in Old Norse. We recommend using the entity name &et;, reflecting the Latin origin of the abbreviation. In The Unicode Standard 12.0 this character is located at 204A in the range General Punctuation.

There are two major variants of this sign. If the transcriber wishes to make a distinction between these, we suggest using &et; for the sign without a crossbar and &etslash; for the sign with a crossbar. The code point for the latter is F158.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
⁊	ok	<am>&et;</am>	<ex>ok</ex>	`204A`
	ok	<am>&etslash;</am>	<ex>ok</ex>	`F158`
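
In a multi-level transcription, the “et” mark makes up the whole word, so the <am> and <ex> elements each fill their level completely. The following is a minimal sketch modelled on the example in ch. 6.1; it assumes that the me: namespace prefix and the entity &et; have been declared.

```xml
<w>
  <choice>
    <!-- Facsimile level: the Tironian sign as written in the manuscript -->
    <me:facs><am>&et;</am></me:facs>
    <!-- Diplomatic level: the editor's expansion of the sign -->
    <me:dipl><ex>ok</ex></me:dipl>
    <!-- Normalised level: no abbreviation markup -->
    <me:norm>ok</me:norm>
  </choice>
</w>
```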

6.3.2 The “ed” mark

In Latin, the semicolon was used for e + dental consonant, as in the conjunction sed. In Old Norse, it is very often used in the preposition með ‘with’. We recommend &sem; as entity name.

In The Unicode Standard 12.0 the semicolon is located at 003B in the range Basic Latin. When the semicolon is used as a punctuation mark, it should be transcribed as such, i.e., simply as “;”. When it is used as an abbreviation mark we recommend that it is transcribed with an entity, &sem;. Note that there is another form of this abbreviation mark, looking like the number “3”. This is included in the MUFI character recommendation v. 4.0 at codepoint A76B and can be encoded with the entity &etfin;.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
m	með	m<am>&sem;</am>	m<ex>eð</ex>	`F1AC`

6.3.3 The “con” mark

A sign resembling a backwards “c” was often used for con in Latin and kon in Norse words. This “con” mark is similar to 0254 LATIN SMALL LETTER OPEN O in the range IPA Extensions of The Unicode Standard 12.0 and may be identified with this character. Alternatively, it may be rendered by 2184 LATIN SMALL LETTER REVERSED C in the range Number Forms.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
ɔa	kona	<am>&oopen;</am>a	<ex>kon</ex>a	`0254`

See the MUFI character recommendation v. 4.0 for other variants of the “con” mark (descending and with a dot).

6.3.4 The “rum” mark

The sequence “rum” was often abbreviated with a character resembling a small version of the number 4 (in fact, it is the round “r” with a stroke across its tail). We recommend the entity name &rum; and a separate code point in the Private Use Area.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
foꝝ	forum	fo<am>&rum;</am>	fo<ex>rum</ex>	`F154`

6.3.5 The cross mark

The word kross was sometimes abbreviated with the cross symbol, which we suggest calling &cross;. This kross mark can be identified with 271D LATIN CROSS in the range Dingbats of The Unicode Standard 12.0.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
✝	kross	<am>&cross;</am>	<ex>kross</ex>	`271D`

6.3.6 The “m” rune

The runic character for “m” was sometimes used for the word maðr (including case forms with the stem mann-). We recommend the entity name &mMedrun;. The Unicode Standard 12.0 has defined a selection of 89 runes from the Older and Younger Futhark in the Runic range. This range includes the “m” rune.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
ᛘ	maðr	<am>&mMedrun;</am>	<ex>maðr</ex>	`16D8`

The runic character may appear with interlinear marks (“a”, “i”, “e”, “n”, “z”) for various inflected forms of the word maðr, e.g., manna, manni/manne, mann, mannz. The encoding of this type is discussed in ch. 6.4.7 below.

6.3.7 The “f” rune

The runic character for “f” was sometimes used for the word fé. In analogy with the use of the “m” rune, we suggest the entity name &fMedrun;.

The “f” rune is included in the Runic range of The Unicode Standard 12.0.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
ᚠ	fé	<am>&fMedrun;</am>	<ex>fé</ex>	`16A0`

6.3.8 Dot (full stop)

Dots were often used as abbreviation marks, typically for suspensions, e.g. “ſ.” for sonr (or segja, svara). We recommend that the dot is transcribed in the same manner as a full stop, i.e. with the “.” mark in Basic Latin. Thus, no entity name is called for. The dots sometimes appear on both sides of the abbreviated word, as in “.ſ.”. Both dots serve to indicate an abbreviation and consequently both are left out of the transcription when the abbreviation is expanded. The two <am> elements are thus replaced by a single <ex> element, as shown below.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
ſ.	ſonr	ſ<am>.</am>	ſ<ex>onr</ex>	`002E`
.ſ.	ſonr	<am>.</am>ſ<am>.</am>	ſ<ex>onr</ex>	`002E`
.kgr.	konungr	<am>.</am>kgr<am>.</am>	k<ex>onun</ex>gr	`002E`

If the transcriber wishes to distinguish between the dot used as an abbreviation mark and the dot used as a punctuation mark, we suggest that the entity name &period; could be used in the former case and “.” in the latter. However, we believe that there will arise a number of cases where it is difficult to decide whether the dot in the manuscript is a mark of abbreviation, punctuation or both, e.g. when a suspended word is the last word in a sentence. We therefore believe it is better to accept that the full stop is an ambivalent mark, as is also (although to a much lesser extent) the case with the colon and the runic characters “f” and “m”. When the encoder believes that the full stop is an abbreviation mark, this should be indicated simply by using the <am> element, as shown here.
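
The case of dots on both sides of a contracted word can be sketched on three levels as follows. The facsimile level has two <am> elements, one for each dot, while the diplomatic level has a single <ex> element for the letters left out. The example follows the table above for “.kgr.” and assumes that the me: namespace prefix has been declared.

```xml
<w>
  <choice>
    <!-- Facsimile level: both dots are marked as abbreviation marks -->
    <me:facs><am>.</am>kgr<am>.</am></me:facs>
    <!-- Diplomatic level: the dots are dropped, the missing letters supplied -->
    <me:dipl>k<ex>onun</ex>gr</me:dipl>
    <!-- Normalised level: no abbreviation markup -->
    <me:norm>konungr</me:norm>
  </choice>
</w>
```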

6.3.9 Colon

The colon is sometimes, though not often, used as a mark of suspension, in the same manner as the dot (full stop). In analogy with the encoding of dots we suggest transcribing the colon simply as a colon, i.e., without using an entity.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
Rognv:	Rognvaldr	Rognv<am>:</am>	Rognv<ex>aldr</ex>	`003A`

6.3.10 Small capitals

In Old Icelandic, small capitals were used to denote geminated (long) consonants or they were simply used ornamentally (especially in Old Norwegian). Small capitals denoting geminate consonants, such as “ɢ” for the long or geminate gg in contrast to “g” denoting the short g, are comparable to Ancient Greek “ω” denoting the long vowel ō in contrast to “o” denoting the short vowel o, or “η” for the long vowel ē in contrast to “ε” denoting the short e. The small capitals should, therefore, not be treated as abbreviations but rather as symbols in their own right and encoded as small capitals on both facsimile and diplomatic levels. The use of small capitals is rarely consistent and systematic and sometimes they may stand for consonants that were undoubtedly short. Rather than attempting to differentiate accurately between small capitals with phonological value and those that are superfluous, it seems more practical to simply reproduce all small capitals in both the facsimile and diplomatic transcriptions.

Small capitals may also appear with a superscript dot; for example, “ʀ̇”. These should not be treated mechanically interpreting the superscript dot as a signal of length, as that is almost certainly not what the scribe had in mind. Instead, small capitals with a superscript dot should be transcribed as such both in a facsimile and a diplomatic transcription.

6.4 Combining abbreviation marks

The majority of abbreviation marks are placed above, through or below a base line character. It could be argued that they really refer to the whole word, but from an analytical point of view we recommend that they are encoded immediately after the base line character to which they seem most closely associated. Cf. the rules in ch. 5.2.1.

It is sometimes difficult to decide whether a sign is placed on the base line or above another base line character. For example, the “us” mark (cf. ch. 6.4.3 below) may sometimes occupy a position of its own, although slightly raised above the base line. The classification in this chapter is based on what we believe are the prototypical positions of the abbreviation marks.

6.4.1 Horizontal bar

The horizontal bar is from a historical point of view the earliest form of an abbreviation mark and it is also the most ambiguous type. It is commonly used for “m” or “n” and is often referred to as a “nasal stroke”, but it is also used in a number of other contexts, as a mark of suspension or contraction. We recommend using the same entity name in all instances, &bar;. The unmarked position of the bar is above the immediately preceding character.

This horizontal bar is partially similar to 0304 COMBINING MACRON and 0305 COMBINING OVERLINE in the range Combining Diacritical Marks of The Unicode Standard 12.0, and may be identified with the latter.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
ū	um	u<am>&bar;</am>	u<ex>m</ex>	`0305`
allā	allan	alla<am>&bar;</am>	alla<ex>n</ex>	`0305`
en̅	enn	en<am>&bar;</am>	en<ex>n</ex>	`0305`
p̅	preſtr	p<am>&bar;</am>	p<ex>reſtr</ex>	`0305`
ꝥ	þat	þ<am>&bar;</am>	þ<ex>at</ex>	`0305`
ħ	hann	h<am>&bar;</am>	h<ex>ann</ex>	`0305`

In the last two examples, the bar crosses the ascender of the characters “þ” and “h”. In our view, this is only a coincidence, since the bar in all cases is placed above the x height of the base line character. If there is a character with an ascender, the bar will simply cross this stroke.

The unmarked position of the bar is above the base line character, and this is therefore part of the definition of the entity &bar;. In some cases the bar may be placed below the base line character. Here, we suggest the entity name &barbl; (for “bar below”).

The horizontal bar below is partially similar to 0331 COMBINING MACRON BELOW or 0332 COMBINING LOW LINE in the range Combining Diacritical Marks of The Unicode Standard 12.0, and may be identified with the latter.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
ꝧ	þeir	þ<am>&barbl;</am>	þ<ex>eir</ex>	`0332`
ꝑ	per	p<am>&barbl;</am>	p<ex>er</ex>	`0332`

It is possible to identify various shapes of the horizontal bar. In general we recommend that the transcriber should not make more distinctions than strictly necessary. If the transcriber for some reason would like to create a typology of bar forms, we suggest that this is done by numbering, &bar-1;, &bar-2;, &bar-3;, etc. The meaning of each entity must be explained in the header of the transcription and specified in the entity list (cf. app. D below).

6.4.2 Flourish

The flourish may be described as a horizontal bar with a return. It appears in the abbreviation of the Latin word pro in contradistinction to “per”, which typically is abbreviated with a simple horizontal bar. We suggest using the entity name &combflour; and recommend that it is given a separate code point in the Private Use Area.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
ꝓꝼaꞇ	profat	p<am>&combflour;</am>&fins;a&trot;	p<ex>ro</ex>fat	`F1C6`

6.4.3 The us mark

Originally a Tironian nota, a mark resembling a small version of the number “9” is often used for us. It is usually placed in a raised position, though not always clearly above the preceding character. Since the typical position of this mark is above the base line, we regard it as a combining mark and suggest the entity name &us; and recommend that it is given a separate code point in the Private Use Area.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
h᷒	hus	h<am>&us;</am>	h<ex>us</ex>	`F15B`
la᷒	laus	la<am>&us;</am>	la<ex>us</ex>	`F15B`

6.4.4 The er mark

A mark resembling a zigzag was frequently used as abbreviation of a front vowel (including diphthongs) + “r”, e.g. “ir”, “er”, “eir”, “ær”. The earliest form resembles a horizontal stroke with a descender to the left and an ascender to the right. It later acquired a zigzag-like form and even later resembles the letter “u” turned upside-down. This abbreviation mark has now become part of the Unicode Standard (based on its usage in Lithuanian) in the range Combining diacritical marks.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
v͛a	vera	v<am>&er;</am>a	v<ex>er</ex>a	`035B`
va	vera	v<am>&ercurl;</am>a	v<ex>er</ex>a	`F1C8`
heꝼ	hefir	he&fins;<am>&ercurl;</am>	hef<ex>ir</ex>	`F1C8`

6.4.5 The ra mark

Originally an open form of the character “a”, this mark was used as an abbreviation for ra or va. One variant resembles the Greek letter Omega and another variant an Omega with a horizontal bar above. We suggest using the entity name &ra; for the first type and &rabar; for the second. We recommend that the latter is given a separate code point in the Private Use Area.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
sᷓ	sva	s<am>&ra;</am>	s<ex>va</ex>	`1DD3`
ꝼ	fra	&fins;<am>&rabar;</am>	f<ex>ra</ex>	`F1C1`

6.4.6 The ur mark

The syllable ur (sometimes yr) can be abbreviated by a mark resembling a small version of the numeral “2”. A second form of this mark resembles a tilde, and a third form a horizontal version of the number “8” (equal to the lemniscate symbol), cf. Hreinn Benediktsson 1965, p. 91. Due to the considerable variation in form we suggest that it might be useful to distinguish between three main forms, using the entity &urrot; for the first type, &ur; for the second and &urlemn; for the third. The code points are respectively F153, 1DD1 and F1C2 (the first and the third of these in the Private Use Area).

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
ock	ockur	ock<am>&urrot;</am>	ock<ex>ur</ex>	`F153`
b᷑þ	burþ	b<am>&ur;</am>þ	b<ex>ur</ex>þ	`1DD1`
ſpþe	ſpurþe	ſp<am>&urlemn;</am>þe	ſp<ex>ur</ex>þe	`F1C2`

6.4.7 Interlinear characters

Interlinear characters are a common type of abbreviation. An interlinear vowel typically represents a consonant (often r) + the vowel itself, while an interlinear consonant typically represents a vowel (often a) + the consonant itself. We suggest that interlinear abbreviation marks are named by the character itself + “sup” (for “superscript”), e.g. &asup; (interlinear “a”), &osup; (interlinear “o”), &rscapsup; (interlinear small capital “r”), etc.

The Unicode Standard 12.0 includes a selection of more than 30 superscript characters in the ranges Combining Diacritical Marks, 0363–036F and Combining Diacritical Marks Supplement, 1DD4–1DF4. We suggest that these characters are used to display interlinear characters and that characters outside this selection are given separate code points in the Private Use Area. See the MUFI character recommendation v. 4.0 for further specifications.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
bͦg	borg	b<am>&osup;</am>g	b<ex>or</ex>g	`0366`
mͣ	manna	m<am>&asup;</am>	m<ex>anna</ex>	`0363`
vþa	virþa	v<am>&inodotsup;</am>þa	v<ex>ir</ex>þa	`0365`
þegͬ	þegar	þeg<am>&rsup;</am>	þeg<ex>ar</ex>	`036C`

The runic character “m”, which itself can be used as an abbreviation (cf. ch. 6.3.6 above), can appear with an interlinear abbreviation mark. The encoding follows the pattern above.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
ᛘͣ	manna	<am>&mMedrun;&asup;</am>	<ex>manna</ex>	`16D8 + 0363`

Since the first entity, &mMedrun;, is defined as a base line character and the second, &asup;, as an interlinear mark placed above the immediately preceding base line character, there will be no doubt as to the positioning.
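
Put into a multi-level transcription, the rune with its interlinear “a” might look like the sketch below. Both entities are placed inside one <am> element, since together they make up the abbreviation for the whole word. The example assumes that the me: namespace prefix and the two entities have been declared.

```xml
<w>
  <choice>
    <!-- Facsimile level: base line rune followed by the combining mark -->
    <me:facs><am>&mMedrun;&asup;</am></me:facs>
    <!-- Diplomatic level: the whole word is an expansion -->
    <me:dipl><ex>manna</ex></me:dipl>
    <!-- Normalised level: no abbreviation markup -->
    <me:norm>manna</me:norm>
  </choice>
</w>
```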

6.4.8 Superscript dots

Superscript dots are sometimes used to denote length. It is a moot question whether this is a type of abbreviation, but in any case the transcriber should use an entity for the encoding. We recommend that superscript dots are transcribed in analogy with other combining abbreviation marks and suggest using the entity name &combdot; (for combining dot above).

The Unicode Standard 12.0 has a combining dot above in the range Combining diacritical marks.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
leġıa	leggia	leg<am>&combdot;</am>&inodot;a	leg<ex>g</ex>ia	`0307`
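
The following sketch places the example from the table in a multi-level transcription. Only the superscript dot is wrapped in <am>; the other characters of the word, including the entity &inodot;, stay outside it. The normalised form leggja is an assumption added for the illustration, as is the declaration of the me: namespace prefix and the entities.

```xml
<w>
  <choice>
    <!-- Facsimile level: the dot follows the "g" it is placed above -->
    <me:facs>leg<am>&combdot;</am>&inodot;a</me:facs>
    <!-- Diplomatic level: the dot is expanded to a second "g" -->
    <me:dipl>leg<ex>g</ex>ia</me:dipl>
    <!-- Normalised level (assumed form) -->
    <me:norm>leggja</me:norm>
  </choice>
</w>
```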

6.5 Special cases

6.5.1 Nomina sacra

In some cases, the whole word must be analysed as an abbreviation. This applies to the traditional nomina sacra, i.e. abbreviations for sacred words such as Iesus and Christus. These contain characters which originally were Greek but might be taken for Latin characters. For example, the “p” in “xpm” is originally a Greek “rho” (“r”).

We believe these abbreviations should be encoded as a sequence of the individual base line characters and one or more combining bars above. In the examples below, the originally Greek base line characters have been identified with the similar-looking Latin characters. Greek characters might also have been used in the encoding (such as &igr; for GREEK SMALL LETTER IOTA, etc.).

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
ıħc	iesus	<am>ıh&bar;c</am>	<ex>iesus</ex>	`0305`
xp̅m	christum	<am>xp&bar;m</am>	<ex>christum</ex>	`0305`

Typically in abbreviations of this sort, the bar extends over neighbouring letters. To replicate that, the combining bar above can be encoded as part of the neighbouring letters as well, even if it may be argued that there is only a single bar in each example, and that this bar simply happens to extend over more than one character. This problem is discussed more fully in ch. 6.5.5 below.

6.5.2 Interlinear characters in other contexts

Interlinear (superscript) characters are used in various ways, not always as abbreviations. According to Andrea de Leeuw van Weenen 2000, pp. 36–43 there are four types:

(a) as abbreviation

This type is discussed in ch. 6.4.7 above. Here, we recommend the usage of the <am> element and entities such as &asup; and the like.

(b) as addition

When superscript characters are used for adding characters which are missing on the base line we recommend that they are encoded by use of the element <add> and the attribute @place="supralinear" (cf. ch. 9.2.1.1). One might argue that if the scribe has added the superscript character as part of the writing process, it should not be regarded as an addition, but rather as an ordinary character which just happened to be placed over the line. However, it is difficult to make a distinction between characters written above the line as part of the writing process and characters that were added later, maybe shortly afterwards, whether by the same scribe or by another scribe. As a general rule we therefore recommend encoding supralinear characters by the <add> element and the attribute @place="supralinear". There is no need for an entity of the type &asup; since the location of the character is indicated by the element.

Manuscript form	Display in an edition	Encoding
hanͣ	han⸌a⸍	han<add place="supralinear">a</add>

(c) as complementation of Roman numbers

Inflected forms of Roman numbers are sometimes specified by interlinear characters. In these cases, the interlinear characters are not placed above any base line character but merely raised above the base line. We suggest encoding them by appropriate entities, such as &trotsup; and &inodotsup;. Note that these interlinear characters are combining, so in order to avoid a position above the last base line character, an empty space entity has been added before each raised character:

Manuscript form	Display in an edition	Encoding
v.  	v.  	v. &trotsup; &inodotsup;

(d) as space savers

Especially at the end of a line one or more characters may be placed above the last word to save space and complete the line. In these cases, we suggest that they are interpreted and encoded as part of the writing process, thus not using the <add> element, but by superscript entities, such as &ssup; for a superscript “s”.

Manuscript form	Display in an edition	Encoding
eᷤ	eᷤ	e&ssup;

6.5.3 Missing abbreviation mark

From time to time one can find examples of a word that obviously is abbreviated but where there is no trace of the abbreviation mark. For example, a single “d” (with no punctuation mark) may be used for the word “dróttning”. There is then no alternative but transcribing the text as it reads in the manuscript.

This is what we would recommend on the facsimile level of transcription. On the diplomatic and normalised levels, we recommend using the <supplied> element, i.e. “d<supplied>rottning</supplied>”.
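
The recommendation for a word lacking its abbreviation mark can be sketched on three levels as follows. Since there is no mark in the manuscript, neither <am> nor <ex> is used. The spelling chosen for the normalised level is an assumption for the illustration, as is the declaration of the me: namespace prefix.

```xml
<w>
  <choice>
    <!-- Facsimile level: the text exactly as it reads in the manuscript -->
    <me:facs>d</me:facs>
    <!-- Diplomatic level: the missing letters are supplied by the editor -->
    <me:dipl>d<supplied>rottning</supplied></me:dipl>
    <!-- Normalised level: same markup, normalised spelling (assumed) -->
    <me:norm>d<supplied>róttning</supplied></me:norm>
  </choice>
</w>
```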

6.5.4 Nesting (stacking) of abbreviation marks

There are a few examples of base line characters which are abbreviated with an abbreviation mark which is itself abbreviated. An example is the base line character “m” with an interlinear “o” which in turn has a horizontal bar. According to rule 6 in ch. 5.2.1 above this abbreviation should be encoded as the sequence m + &osup; + &bar;.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion
mͦ̅	monnom	m<am>&osup;&bar;</am>	m<ex>onnom</ex>

Since &osup; is defined as a combining character, it follows that it is placed above the immediately preceding character, in this case “m”, and since &bar; is also defined as a combining character, it follows that it is placed above &osup;. There is therefore no doubt as to the positioning of each part.
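
A stacked abbreviation of this kind might be encoded on three levels as in the sketch below. The order of the two entities inside <am> reflects the stacking, from the mark nearest the base line character upwards. The normalised form mǫnnum is an assumption for the illustration, as is the declaration of the me: namespace prefix and the entities.

```xml
<w>
  <choice>
    <!-- Facsimile level: "m", then interlinear "o", then the bar above it -->
    <me:facs>m<am>&osup;&bar;</am></me:facs>
    <!-- Diplomatic level: both marks are covered by a single expansion -->
    <me:dipl>m<ex>onnom</ex></me:dipl>
    <!-- Normalised level (assumed form) -->
    <me:norm>mǫnnum</me:norm>
  </choice>
</w>
```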

6.5.5 Extension of abbreviation marks

As a rule, combining abbreviation marks are associated with a single base line character. Thus, the sequence m&osup; means that the interlinear character “o” is seen as being placed above “m” and not above any other character. However, some abbreviation marks extend over more than one character. For example, the word “kirkia” may be abbreviated with a horizontal bar crossing both the first and the second “k”. We believe it is sufficient to associate the abbreviation mark with only one of these characters, preferably the first.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion
k̅k̅ıa	kirkia	k<am>&bar;</am>k&inodot;a	k<ex>ir</ex>kia

It is possible to encode this word in such a way that the bar is associated with both characters. This is in a sense closer to the manuscript form, but it means that a single abbreviation mark may appear as two distinct marks (unless it is somehow stated that the two marks belong together). Thus, this is a more complex and possibly misleading solution.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion
k̅k̅ıa	kirkia	k<am>&bar;</am>k<am>&bar;</am>&inodot;a	k<ex>ir</ex>kia

On the other hand, it should be noted that this is a case where 0305 COMBINING OVERLINE is appropriate, since it connects to left and right. Cf. the reference in ch. 6.4.1 above.
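
For reference, the simpler encoding, with the bar associated with the first “k” only, might look like this in a multi-level transcription. The normalised form kirkja is an assumption for the illustration, as is the declaration of the me: namespace prefix and the entities.

```xml
<w>
  <choice>
    <!-- Facsimile level: a single bar, attached to the first "k" -->
    <me:facs>k<am>&bar;</am>k&inodot;a</me:facs>
    <!-- Diplomatic level: the bar is expanded after the first "k" -->
    <me:dipl>k<ex>ir</ex>kia</me:dipl>
    <!-- Normalised level (assumed form) -->
    <me:norm>kirkja</me:norm>
  </choice>
</w>
```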

6.5.6 Sporadic ligatures with abbreviation marks

We recommend that sporadic ligatures should not be encoded by use of separate entities but by the element <seg> with the attribute @type="lig". A sporadic ligature is basically a joining of two base line characters which together do not reflect a separate phonological value. This is the case with ligatures such as “s+k” and “p+p” which in this respect are identical to “s” + “k” and “p” + “p”.

See ch. 4.3 above for further examples.

Manuscript form	Expanded form	Encoding
	pp	<seg type="lig">pp</seg>

However, some ligatures are formed in such a manner that it is difficult to distinguish the separate parts. That applies to the ligature of long “s” + “h”, “k” and “þ”. In these cases, we suggest that it is advisable to use individual entities. These characters must be referred to the Private Use Area.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
	hanſ	<am>&hslonglig;</am>	h<ex>an</ex>ſ	`EBAD`
	konungſ	<am>&kslonglig;</am>	k<ex>onung</ex>ſ	`EBAE`
	þeſſ	<am>&thornslonglig;</am>	þ<ex>eſ</ex>ſ	`E734`

Often, a horizontal bar is used across these ligatures. The bar may be encoded separately with its usual entity, &bar; (cf. ch. 6.4.1 above) or with a character located in the Private Use Area.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
	konungſ	<am>&kslonglig;&bar;</am>	k<ex>onung</ex>ſ	`EBAE + 0305`
	konungſ	<am>&kslongligbar;</am>	k<ex>onung</ex>ſ	`E7C8`

6.5.7 The character “r” as interlinear ligature

A quite special type of abbreviation is interlinear “r” in ligature with e.g. “þ”. The encoding is similar to the examples in ch. 6.5.6 above.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
	þar	<am>þ&rsup;</am>	þ<ex>a</ex>r	`E8C1`

6.5.8 Sharp “s”

In late Old Norwegian, the “sharp s” appears in a number of abbreviations, e.g. for skilling, smør and son. The German character “sharp s” is defined in The Unicode Standard 12.0 as 00DF LATIN SMALL LETTER SHARP S in the range Latin-1 Supplement. We recommend using the ISO entity &szlig; also when this character is used as an abbreviation mark. The element <am> will indicate clearly that it is an abbreviation mark, not an ordinary character. See the discussion on the full stop in ch. 6.3.8 above.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
Hakonß	Hakonson	Hakon<am>&szlig;</am>	Hakon<ex>son</ex>	`00DF`

6.5.9 The semicolon abbreviation

In Old Swedish, the semicolon character is often used as an abbreviation for the front vowel “e” or “æ” + the dental “t” or “dh”, typically in words like “medh” and “thet” (Old Norse með and þat). The expansion of the abbreviated syllable varies, and will in many cases be based on the usage in the individual manuscripts.

Abbreviated form	Expanded form	Encoding of abbreviation	Encoding of expansion	Code point
mꝫ	medh	m<am>&etfin;</am>	m<ex>edh</ex>	`A76B`
thꝫ	thet	th<am>&etfin;</am>	th<ex>et</ex>	`A76B`

There is a long-standing tradition for rendering this abbreviation character as “z”, e.g. “mz” and “thz”. This goes back to 17th century editions of medieval texts, and it has been kept in a large number of later scholarly editions. We recommend expanding this abbreviation according to the spelling of these words in full, as suggested in the table above. It cannot be denied, however, that this rule requires some investigation into the manuscript orthography when deciding the exact expansion.

6.6 List of abbreviation marks

An extensive list of abbreviation characters is found in the MUFI character recommendation, cf. app. A below.