Chapter 6. Abbreviations

Version 3.0 beta

This is a preliminary version which can be changed or updated at any time.
The revision and updating of this chapter has been assigned to Haraldur Bernharðsson.

6.1 Introduction

Abbreviations are a common feature of medieval orthography. In the medieval Nordic tradition, abbreviations were used most frequently in Norwegian and Icelandic manuscripts, and particularly in the latter. In some Icelandic manuscripts, as many as a third of the words may be abbreviated, some of them with several abbreviation marks. The system of abbreviations was inherited from English and Continental practice, but the adoption of this system also meant that the usage of some abbreviation marks was extended and it led to the development of some new types.

For encoding abbreviations, we recommend using two elements:

Element Contents
<am> (abbreviation marker) contains the abbreviation sign used in the manuscript
<ex> (editorial expansion) contains the expansion inserted by the editor, replacing the abbreviation marker

The encoding of abbreviations is discussed in the TEI P5 Guidelines in ch. 11, particularly ch. 11.3.2. In addition to the elements above, TEI P5 offers the <abbr> element for abbreviations spanning a whole word, such as the nomen sacrum “xpc”, which is an abbreviation for “christus”. We recommend using only the elements <am> and <ex>.

In a multi-level transcription, the <am> element typically belongs to the facsimile level (<me:facs>), while the <ex> element belongs to the diplomatic level (<me:dipl>). The normalised level (<me:norm>) usually contains neither element, e.g.


<w>
  <choice>
    <me:facs>han<am>&bar;</am></me:facs>
    <me:dipl>han<ex>n</ex></me:dipl>
    <me:norm>hann</me:norm>
  </choice>
</w>

The <am> element may have a @me:type attribute specifying what kind of abbreviation it is. The same applies to the <ex> element. We have not given examples of these attributes in the present chapter, but users may refer to the typology in ch. 6.2 below if they would like to make a more detailed encoding.

Element Contents
<am> contains the abbreviation sign used in the manuscript
    @me:type specifies the type of abbreviation (optional)
<ex> contains the expansion inserted by the editor, replacing the abbreviation marker
    @me:type specifies the type of expansion (optional)
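
As a sketch of the more detailed encoding mentioned above, the introductory example is repeated below with @me:type on both elements. The attribute value "bar" is a hypothetical placeholder chosen for illustration only; this chapter does not define a fixed list of values.

```xml
<w>
  <choice>
    <!-- The me:type values are hypothetical placeholders, not values defined in this chapter -->
    <me:facs>han<am me:type="bar">&bar;</am></me:facs>
    <me:dipl>han<ex me:type="bar">n</ex></me:dipl>
    <me:norm>hann</me:norm>
  </choice>
</w>
```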

In this chapter, we shall give a typology of abbreviation and then exemplify a number of cases.

6.2 Typology

6.2.1 Overview

Abbreviations are usually divided into four categories (see, e.g., Hreinn Benediktsson 1965, p. 85 and, for a more detailed classification, Kristian Kålund 1907, pp. viii-x):

(1) Suspensions. The first part of the word, often the initial letter only, is written out, followed by a dot or similar mark, e.g., “ſ.” = ſonr 'son'. The plural may be represented by a doubling of the initial letter, e.g., “ſſ.” = ſynir 'sons'.

(2) Contractions. Some letters are left out, but the initial and final letters are written out, often one or more of the intermediate as well. The abbreviation is often indicated with a horizontal bar above the word.

(3) Interlinear marks. The interlinear abbreviation is usually either a vowel, representing r or v + the vowel itself, or a consonant, representing a + the consonant itself.

(4) Special signs (brevigraphs). These signs are usually placed on the base line and are thus akin to ordinary letters. The Tironian notae belong to this category.

The typology in ch. 6.3 below takes as its point of departure the location of the abbreviations. The main distinction is drawn between abbreviation signs placed on the base line and those placed above (or through or below) a base line character. We suggest that letter-sized characters on the base line be referred to as signs, while combining abbreviation marks (above, through or below another character) are referred to as marks. For the sake of simplicity, however, we shall refer to both categories as marks in this chapter.

6.2.2 Glyphs

Glyphs are displayed in the Andron font by Andreas Stötzner (Leipzig). The regular version of this font can be downloaded from the MUFI font page.

Since abbreviation marks typically appear as parts of words and are frequently associated with a base line character we have chosen to illustrate each mark within the context of a whole word.

6.2.3 Entity names

All abbreviations are referred to with entity names, with the exception of full stop, “.”, and colon, “:”. Entity names are placed within the delimiters “&” and “;”, and we have tried to make the names as short and mnemonic as possible. As a rule, we have based the entity name on the typical expansion of the abbreviation. Thus, the cross mark, which is an abbreviation for kross, is given the entity name “&cross;”.

We aim at synchronizing our use of entities with those recommended by ISO, but since there presently are no abbreviation entities in ISO, we are left to our own devices in this chapter.

6.2.4 Unicode values

Unicode 5.0 has only defined a handful of abbreviation characters and only a few of interest for our use. The great majority of abbreviation characters must therefore be defined as code values in the Private Use Area. The only exceptions are the full stop, colon and semicolon, which are part of the range Basic Latin in Unicode, and the Tironian sign for et, in the range General Punctuation.

For a complete list of suggested Unicode values, see Appendix A below.
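
To illustrate how entity names and code values relate, the sketch below declares a few of the entities discussed later in this chapter in an internal DTD subset and uses one of them. The names and code points are those given in ch. 6.3 and 6.4; declaring them in an internal subset is an assumption made for the sake of a self-contained example, not necessarily how an actual entity list is organised.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative internal subset; the layout is an assumption,
     the names and code points are taken from this chapter -->
<!DOCTYPE w [
  <!ENTITY et    "&#x204A;"> <!-- Tironian sign et, General Punctuation -->
  <!ENTITY cross "&#x271D;"> <!-- Latin cross, Dingbats -->
  <!ENTITY bar   "&#x0305;"> <!-- combining overline -->
  <!ENTITY us    "&#xF15B;"> <!-- "us" mark, Private Use Area -->
]>
<w>h<am>&us;</am></w>
```

With declarations of this kind, a parser resolves “&us;” to the character at F15B, which a suitable font can then display.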

6.2.5 Descriptive names

As is the case with ordinary characters (cf. ch. 5) we adhere to the naming scheme in Unicode. Since Unicode 5.0 only defines one abbreviation mark in the Latin alphabet, the TIRONIAN SIGN ET in the range General Punctuation, and only one in each of the Armenian, Syriac, Devanagari, Thai and Khmer alphabets, we do not have completely clear examples of descriptive names. We suggest ABBREVIATION SIGN “000” as a general name for abbreviations occupying a separate position on the base line, and COMBINING ABBREVIATION MARK “000” for those typically placed above, through or below a base line character.

6.3 Abbreviation marks on the base line

Abbreviation marks on the base line behave as any other character. The typology of these abbreviation marks is discussed and exemplified below.

6.3.1 The “et” mark

The Tironian nota resembling the number “7” (or the character “z” with or without a crossbar) used for the conjunction et 'and' in Latin is frequently used to denote the corresponding conjunction ok 'and' in Old Norse. We recommend using the entity name “&et;”, reflecting the Latin origin of the abbreviation. In Unicode 5.0 this character is located at 204A in the range General Punctuation.

There are two major variants of this sign. If the transcriber wishes to make a distinction between these, we suggest using “&et;” for the sign without a crossbar and “&etslash;” for the sign with a crossbar. The code point for the latter is F158.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Abbreviation mark code point
ok <am>&et;</am> <ex>ok</ex> 204A
ok <am>&etslash;</am> <ex>ok</ex> F158
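
Following the multi-level pattern introduced in ch. 6.1, a word written with the “et” mark might be encoded as sketched below. Since the mark stands for the whole word, the facsimile level contains only the <am> element and the diplomatic level only the <ex> element; the content of the normalised level is an assumption.

```xml
<w>
  <choice>
    <!-- The Tironian sign stands for the whole word -->
    <me:facs><am>&et;</am></me:facs>
    <me:dipl><ex>ok</ex></me:dipl>
    <!-- Normalised form assumed for illustration -->
    <me:norm>ok</me:norm>
  </choice>
</w>
```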

6.3.2 The “ed” mark

In Latin, the semicolon was used for e + dental consonant, as in the conjunction sed. In Old Norse, it is very often used in the preposition með 'with'. We recommend “&sem;” as entity name.

In Unicode 5.0 the semicolon is located at 003B in the range Basic Latin. When the semicolon is used as a punctuation mark, it should be transcribed as such, i.e., simply as “;”. When it is used as an abbreviation mark we recommend that it is transcribed with an entity, “&sem;”. Note that there is another form of this abbreviation mark, looking like the number “3”. This is included in the MUFI character recommendation at code point F155 and can be encoded with the entity “&etfin;”.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Abbreviation mark code point
m; með m<am>&sem;</am> m<ex>e&eth;</ex> F1AC

6.3.3 The “con” mark

A sign resembling a backwards “c” was often used for con in Latin and kon in Norse words. This “con” mark is similar to 0254 LATIN SMALL LETTER OPEN O in the range IPA Extensions of Unicode 5.0 and may be identified with this character.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Abbreviation mark code point
kona <am>&oopen;</am>a <ex>kon</ex>a 0254

See the MUFI character recommendation for other variants of the “con” mark (descending and with a dot).

6.3.4 The “rum” mark

The sequence “rum” was often abbreviated with a character resembling a small version of the number 4 (in fact, it is the round “r” with a stroke across its tail). We recommend the entity name “&rum;” and a separate code point in the Private Use Area.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Abbreviation mark code point
forum fo<am>&rum;</am> fo<ex>rum</ex> F154

6.3.5 The cross mark

The word kross was sometimes abbreviated with the cross symbol, which we suggest calling “&cross;”.

This kross mark can be identified with 271D LATIN CROSS in the range Dingbats of Unicode 5.0.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Abbreviation mark code point
kross <am>&cross;</am> <ex>kross</ex> 271D

6.3.6 The “m” rune

The runic character for “m” was sometimes used for the word maðr (including case forms with the stem mann-). We recommend the entity name “&mMedrun;”, as introduced in ch. 5.3.7.

Unicode 5.0 has defined a selection of 81 runes from the Older and Younger Futhark in the Runic range. This range includes the “m” rune.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Abbreviation mark code point
maðr <am>&mMedrun;</am> <ex>maðr</ex> 16D8

The runic character may appear with interlinear marks (“a”, “i”, “e”, “n”, “z”) for various inflected forms of the word maðr, e.g., manna, manni/manne, mann, mannz. The encoding of this type is discussed in ch. 6.4.7 below.

6.3.7 The “f” rune

The runic character for “f” was sometimes used for the word fé. In analogy with the use of the “m” rune, we suggest the entity name “&fMedrun;”.

The “f” rune is included in the Runic range of Unicode 5.0.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Abbreviation mark code point
fé <am>&fMedrun;</am> <ex>f&eacute;</ex> 16A0

6.3.8 Dot (full stop)

Dots were often used as abbreviation marks, typically for suspensions, e.g. “ſ.” for sonr (or segja, svara). We recommend that the dot is transcribed in the same manner as a full stop, i.e. with the “.” mark in Basic Latin. Thus, no entity name is called for.

The dots sometimes appear on both sides of the abbreviated word, as in “.ſ.”. Both dots serve to indicate an abbreviation and consequently both are left out of the transcription when the abbreviation is expanded. The two <am> tags are thus replaced by a single <ex> tag, as shown below.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
ſ. ſonr ſ<am>.</am> ſ<ex>onr</ex> 002E
.ſ. ſonr <am>.</am>ſ<am>.</am> ſ<ex>onr</ex> 002E
.kgr. konungr <am>.</am>kgr<am>.</am> k<ex>onun</ex>gr 002E

If the transcriber wishes to distinguish between the dot used as an abbreviation mark and the dot used as a punctuation mark, we suggest that the entity name “&period;” be used in the former case and “.” in the latter. However, we believe that there will be a number of cases where it is difficult to decide whether the dot in the manuscript is a mark of abbreviation, punctuation or both, e.g. when a suspended word is the last word in a sentence. We therefore believe it is better to accept that the full stop is an ambivalent mark, as is also (although to a much lesser extent) the case with the colon and the runic characters “f” and “m”. When the encoder believes that the full stop is an abbreviation mark, that should be indicated simply by using the <am> element, as shown in the table above.
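
As an illustration of the case with dots on both sides, the form “.ſ.” might be encoded on several levels as sketched below, following the pattern from ch. 6.1. The two <am> elements appear only on the facsimile level and correspond to a single <ex> element on the diplomatic level; the normalised form is an assumption.

```xml
<w>
  <choice>
    <!-- Both dots are abbreviation marks -->
    <me:facs><am>.</am>&slong;<am>.</am></me:facs>
    <!-- A single expansion replaces both marks -->
    <me:dipl>&slong;<ex>onr</ex></me:dipl>
    <!-- Normalised form assumed for illustration -->
    <me:norm>sonr</me:norm>
  </choice>
</w>
```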

6.3.9 Colon

The colon is sometimes, though not often, used as a mark of suspension, in the same manner as the dot (full stop). In analogy with the encoding of dots we suggest transcribing the colon simply as a colon, i.e., without using an entity.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
Rognv: Rognvaldr Rognv<am>:</am> Rognv<ex>aldr</ex> 003A

6.3.10 Small capitals

In Old Icelandic, small capitals were used to denote geminated (long) consonants or they were simply used ornamentally (especially in Old Norwegian). In ch. 5.3.3 above we recommended that they were encoded as entities in both cases. Small capitals denoting geminate consonants, such as “ɢ” for the long or geminate gg in contrast to “g” denoting the short g, are comparable to Ancient Greek “ω” denoting the long vowel ō in contrast to “o” denoting the short vowel o, or “η” for the long vowel ē in contrast to “ε” denoting the short e. The small capitals should, therefore, not be treated as abbreviations but rather as symbols in their own right and encoded as small capitals on both facsimile and diplomatic level. The use of small capitals is rarely consistent and systematic and sometimes they may stand for consonants that were undoubtedly short. Rather than attempting to differentiate accurately between small capitals with phonological value and those that are superfluous, it seems more practical to simply reproduce all small capitals in both the facsimile and diplomatic transcriptions.

Small capitals may also appear with a superscript dot, for example “ʀ̇”. The superscript dot should not be mechanically interpreted as a signal of length, as that is almost certainly not what the scribe had in mind. Instead, small capitals with a superscript dot should be transcribed as such in both a facsimile and a diplomatic transcription.

6.4 Combining abbreviation marks

The majority of abbreviation marks are placed above, through or below a base line character. It could be argued that they really refer to the whole word, but from an analytical point of view we recommend that they are encoded immediately after the base line character to which they seem most closely associated. Cf. the rules in ch. 2.2.1.

It is sometimes difficult to decide whether a sign is placed on the base line or above another base line character. For example, the “us” mark (cf. ch. 6.4.3 below) may sometimes occupy a position of its own, although slightly raised above the base line. The classification in this chapter is based on what we believe are the prototypical positions of the abbreviation marks.

6.4.1 Horizontal bar

The horizontal bar is from a historical point of view the earliest form of an abbreviation mark and it is also the most ambiguous type. It is commonly used for “m” or “n” and is often referred to as a “nasal stroke”, but it is also used in a number of other contexts, as a mark of suspension or contraction. We recommend using the same entity name in all instances, “&bar;”. The unmarked position of the bar is above the immediately preceding character.

This horizontal bar is partially similar to 0304 COMBINING MACRON and 0305 COMBINING OVERLINE in the range Combining Diacritical Marks of Unicode 5.0, and may be identified with the latter.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
ū um u<am>&bar;</am> u<ex>m</ex> 0305
allā allan alla<am>&bar;</am> alla<ex>n</ex> 0305
en enn en<am>&bar;</am> en<ex>n</ex> 0305
preſtr p<am>&bar;</am> p<ex>reſtr</ex> 0305
þat &thorn;<am>&bar;</am> &thorn;<ex>at</ex> 0305
ħ hann h<am>&bar;</am> h<ex>ann</ex> 0305

In the last two examples, the bar crosses the ascender of the characters “þ” and “h”. In our view, this is only a coincidence, since the bar in all cases is placed above the x height of the base line character. If there is a character with an ascender, the bar will simply cross this stroke.

The unmarked position of the bar is above the base line character, and this is therefore part of the definition of the entity “&bar;”. In some cases the bar may be placed below the base line character. Here, we suggest the entity name “&barbl;” (for “bar below”).

The horizontal bar below is partially similar to 0331 COMBINING MACRON BELOW or 0332 COMBINING LOW LINE in the range Combining Diacritical Marks of Unicode 5.0, and may be identified with the latter.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
þeir &thorn;<am>&barbl;</am> &thorn;<ex>eir</ex> 0332
per p<am>&barbl;</am> p<ex>er</ex> 0332

It is possible to identify various shapes of the horizontal bar. In general we recommend that the transcriber should not make more distinctions than strictly necessary. If the transcriber for some reason would like to create a typology of bar forms, we suggest that this is done by numbering, “&bar-1;”, “&bar-2;”, “&bar-3;”, etc. The meaning of each entity must be explained in the header of the transcription and specified in the entity list (cf. Appendix D below).

6.4.2 Flourish

The flourish may be described as a horizontal bar with a return. It appears in the abbreviation of the Latin word pro in contradistinction to “per”, which typically is abbreviated with a simple horizontal bar. We suggest using the entity name “&combflour;” and recommend that it is given a separate code point in the Private Use Area.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
ꝓꝼaꞇ profat p<am>&combflour;</am>&fins;a&trot; p<ex>ro</ex>fat F1C6

6.4.3 The us mark

Originally a Tironian nota, a mark resembling a small version of the number “9” is often used for us. It is usually placed in a raised position, though not always clearly above the preceding character. Since the typical position of this mark is above the base line, we regard it as a combining mark and suggest the entity name “&us;” and recommend that it is given a separate code point in the Private Use Area.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
h᷒ hus h<am>&us;</am> h<ex>us</ex> F15B
la᷒ laus la<am>&us;</am> la<ex>us</ex> F15B

6.4.4 The er mark

A mark resembling a zigzag was frequently used as an abbreviation of a front vowel (including diphthongs) + “r”, e.g. “ir”, “er”, “eir”, “ær”. The earliest form resembles a horizontal stroke with a descender to the left and an ascender to the right. It later acquired a zigzag-like form and even later came to resemble the letter “u” turned upside-down. This abbreviation mark has now become part of the Unicode Standard (based on its usage in Lithuanian) in the range Combining Diacritical Marks.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
v͛a vera v<am>&er;</am>a v<ex>er</ex>a 035B
va vera v<am>&ercurl;</am>a v<ex>er</ex>a F1C8
heꝼ hefir he&fins;<am>&ercurl;</am> hef<ex>ir</ex> F1C8

6.4.5 The ra mark

Originally an open form of the character “a”, this mark was used as an abbreviation for ra or va. One variant resembles the Greek letter Omega and another variant an Omega with a horizontal bar above. We suggest using the entity name “&ra;” for the first type and “&rabar;” for the second. We recommend that both marks are given separate code points in the Private Use Area.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
sᷓ sva s<am>&ra;</am> s<ex>va</ex> F157
ꝼ fra &fins;<am>&rabar;</am> f<ex>ra</ex> F1C1

6.4.6 The ur mark

The syllable ur (sometimes yr) can be abbreviated by a mark resembling a small version of the numeral “2”. A second form of this mark resembles a tilde, and a third form a horizontal version of the number “8” (equal to the lemniscate symbol), cf. Hreinn Benediktsson 1965, p. 91. Due to the considerable variation in form we suggest that it might be useful to distinguish between three main forms, using the entity “&urrot;” for the first type, “&ur;” for the second and “&urlemn;” for the third. The code points are respectively F153, F1C3 and F1C2 (all in the Private Use Area).

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
ſþe ſpurþe &slong;p<am>&urlemn;</am>&thorn;e &slong;p<ex>ur</ex>&thorn;e F1C2
ock ockur ock<am>&urrot;</am> ock<ex>ur</ex> F153

6.4.7 Interlinear characters

Interlinear characters are a common type of abbreviation. An interlinear vowel typically represents a consonant (often r) + the vowel itself, while an interlinear consonant typically represents a vowel (often a) + the consonant itself. We suggest that interlinear abbreviation marks are named by the character itself + “sup” (for “superscript”), e.g. “&asup;” (interlinear “a”), “&osup;” (interlinear “o”), “&rscapsup;” (interlinear small capital “r”), etc.

Unicode 5.0 includes a selection of 13 superscript characters, namely “a”, “e”, “i”, “o”, “u”, “c”, “d”, “h”, “m”, “r”, “t”, “v”, “x”. They are located at the end of the range Combining diacritical marks, 0363-036F. We suggest that these characters are used to display interlinear characters and that characters outside this selection are given separate code points in the Private Use Area.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
bͦg borg b<am>&osup;</am>g b<ex>or</ex>g 0366
manna m<am>&asup;</am> m<ex>anna</ex> 0363
vþa virþa v<am>&inodotsup;</am>&thorn;a v<ex>ir</ex>&thorn;a 0365
þegͬ þegar &thorn;eg<am>&rsup;</am> &thorn;eg<ex>ar</ex> 036C

The runic character “m”, which itself can be used as an abbreviation (cf. ch. 6.3.6 above), can appear with an interlinear abbreviation mark. The encoding follows the pattern above.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
ᛘͣ manna <am>&mMedrun;&asup;</am> <ex>manna</ex> 16D8 + 0363

Since the first entity, “&mMedrun;”, is defined as a base line character and the second, “&asup;”, as an interlinear mark placed above the immediately preceding base line character, there will be no doubt as to the positioning.
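
The combination of the “m” rune and an interlinear “a” might be encoded on several levels as in the sketch below. Both entities are placed inside one <am> element, since together they stand for the whole word; the normalised form is an assumption.

```xml
<w>
  <choice>
    <!-- Base line rune followed by the combining interlinear "a" -->
    <me:facs><am>&mMedrun;&asup;</am></me:facs>
    <me:dipl><ex>manna</ex></me:dipl>
    <!-- Normalised form assumed for illustration -->
    <me:norm>manna</me:norm>
  </choice>
</w>
```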

6.4.8 Superscript dots

Superscript dots are sometimes used to denote length. It is a moot question whether this is a type of abbreviation, but in any case the transcriber should use an entity for the encoding. We recommend that superscript dots are transcribed in analogy with other combining abbreviation marks and suggest using the entity name “&combdot;” (for “combining dot above”).

Unicode 5.0 has a combining dot above in the range Combining diacritical marks.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
leġıa leggia leg<am>&combdot;</am>&inodot;a leg<ex>g</ex>ia 0307

6.5 Special cases

6.5.1 Nomina sacra

In some cases the whole word must be analysed as an abbreviation. This applies to the traditional nomina sacra, i.e. abbreviations for sacred words such as Iesus and Christus. These contain characters which originally were Greek but might be taken for Latin characters. For example, the “p” in “xpm” is originally a Greek “rho” (“r”).

We believe these abbreviations should be encoded as a sequence of the individual base line characters and one or more combining bars above. In the examples below, the originally Greek base line characters have been identified with the similar-looking Latin characters. Greek characters might also have been used in the encoding (such as “&igr;” for GREEK SMALL LETTER IOTA, etc.).

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
i̅h̅c̅ iesus <am>i&bar;h&bar;c&bar;</am> <ex>iesus</ex> 0305
x̅p̅m̅ christum <am>x&bar;p&bar;m&bar;</am> <ex>christum</ex> 0305

Note that the combining bar above has been encoded more than once in these examples. That ensures an appropriate display of the manuscript text, since the bar will be shown as extending over the whole word. However, it may be argued that there is only a single bar in each example, and that this bar simply happens to extend over more than one character. This problem is discussed more fully in ch. 6.5.5 below.

6.5.2 Interlinear characters in other contexts

Interlinear (superscript) characters are used in various ways, not always as abbreviations. According to de Leeuw van Weenen 2000: 36-43 there are four types:

(a) as abbreviation

This type is discussed in ch. 6.4.7 above. Here, we recommend the usage of entities such as “&asup;”.

(b) as addition

When interlinear characters are used for adding characters which were left out by the scribe we recommend that this is encoded by use of the element <add> and the attribute @place="supralinear" (cf. ch. 7.2). There is no need for an entity of the type “&asup;” since the location of the character is indicated by the element.

Manuscript form Form in edition Encoding
hanͣ han⸌a⸍ han<add place="supralinear">a</add>

(c) as complementation of Roman numbers

Inflected forms of Roman numbers are sometimes specified by interlinear characters. In these cases, the interlinear characters are not placed above any base line character but merely raised above the base line. We suggest using the element <seg> and the attribute @type="superscript".

Manuscript form Form in edition Encoding
v. v.⸌ti⸍ v.<add> place="supralinear">ti</add>

(d) as space savers

Especially at the end of a line one or more characters may be placed above the last word to save space and complete the line. We suggest the same encoding as in (c) above.

Manuscript form Expanded form Encoding
eᷤ e⸌s⸍ e<seg type="superscript">s</seg>
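
The contrast between types (b), (c) and (d) can be summarised in a short sketch based on the elements and attributes named above. The fragments are given without any surrounding word or level elements, which is a simplification made for the example.

```xml
<!-- (b) a character left out by the scribe and added above the line -->
han<add place="supralinear">a</add>

<!-- (c) raised ending of a Roman number -->
v.<seg type="superscript">ti</seg>

<!-- (d) space saver at the end of a line, encoded in the same way as (c) -->
e<seg type="superscript">s</seg>
```

In none of these cases is an <am> element used, since the raised characters are not abbreviation marks.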

6.5.3 Missing abbreviation mark

From time to time one can find examples of a word that obviously is abbreviated but where there is no trace of the abbreviation mark. There is then no alternative but transcribing the text as it reads in the manuscript.

Manuscript form Expanded form Encoding
d(rottning) <am> d </am>

6.5.4 Nesting (stacking) of abbreviation marks

There are a few examples of base line characters which are abbreviated with an abbreviation mark which is itself abbreviated. An example is the base line character “m” with an interlinear “o” which in turn has a horizontal bar. According to rule 7 in ch. 2.2.1 above this abbreviation should be encoded as the sequence “m” + “&osup;” + “&bar;”.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion
mͦ̅ monnom m<am>&osup;&bar;</am> m<ex>onnom</ex>

Since “&osup;” is defined as a combining character, it follows that it is placed above the immediately preceding character, in this case “m”, and since “&bar;” is also defined as a combining character, it follows that it is placed above “&osup;”. There is therefore no doubt as to the positioning of each part.
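
A multi-level encoding of this stacked abbreviation might look like the sketch below. The order of the two entities inside <am> follows the sequence stated above, “m” + “&osup;” + “&bar;”. The normalised level is left out here, since its form is not given in this chapter.

```xml
<w>
  <choice>
    <!-- Interlinear "o" above "m", then the bar above the "o" -->
    <me:facs>m<am>&osup;&bar;</am></me:facs>
    <me:dipl>m<ex>onnom</ex></me:dipl>
  </choice>
</w>
```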

6.5.5 Extension of abbreviation marks

As a rule, combining abbreviation marks are associated with a single base line character. Thus, the sequence “m&osup;” means that the interlinear character “o” is seen as being placed above “m” and not above any other character. However, some abbreviation marks extend over more than one character. For example, the word “kirkia” may be abbreviated with a horizontal bar crossing both the first and the second “k”. We believe it is sufficient to associate the abbreviation mark with only one of these characters, preferably the first.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion
k̅k̅ıa kirkia k<am>&bar;</am>k&inodot;a k<ex>ir</ex>kia

It is possible to encode this word so that the bar is associated with both characters. This is in a sense closer to the manuscript form, but it means that a single abbreviation mark may appear as two distinct marks (unless it is somehow stated that the two marks belong together). Thus, this is a more complex and possibly misleading solution.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion
k̅k̅ıa kirkia k<am>&bar;</am>k<am>&bar;</am>&inodot;a k<ex>ir</ex>kia

On the other hand, it should be noted that this is a case where 0305 COMBINING OVERLINE is appropriate, since it connects to left and right. Cf. the reference in ch. 6.4.1 above.
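
Under the simpler solution, where the bar is associated with the first character only, a multi-level encoding of “kirkia” might look like the sketch below. The normalised level is omitted, since its form is not given in this chapter.

```xml
<w>
  <choice>
    <!-- The bar is attached to the first "k" only -->
    <me:facs>k<am>&bar;</am>k&inodot;a</me:facs>
    <me:dipl>k<ex>ir</ex>kia</me:dipl>
  </choice>
</w>
```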

6.5.6 Sporadic ligatures with abbreviation marks

In ch. 5.4 we recommended that sporadic ligatures should not be encoded by use of separate entities but by the element <seg> with the attribute @type="ligature". A sporadic ligature is basically a joining of two base line characters which together do not reflect a separate phonological value. This is the case with ligatures such as “s+k” and “p+p” which in this respect are identical to “s” + “k” and “p” + “p”.

Manuscript form Expanded form Encoding
(pp) <seg type="ligature">pp</seg>

However, some ligatures are formed in such a manner that it is difficult to distinguish the separate parts. That applies to the ligature of long s + h, k and þ. In these cases, we suggest that it is advisable to use individual entities. These characters must be assigned code points in the Private Use Area.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
hanſ <am>&hslonglig;</am> h<ex>an</ex>&slong; EBAD
konungſ <am>&kslonglig;</am> k<ex>onung</ex>&slong; EBAE
þeſſ <am>&</am> þ<ex>e&slong;</ex>&slong; E734

Often, a horizontal bar is used across these ligatures. The bar may be encoded separately with its usual entity, &bar; (cf. ch. 6.4.1 above) or with a character located in the Private Use Area.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
konungſ <am>&kslonglig;&bar;</am> k<ex>onung</ex>&slong; EBAE + 0305
konungſ <am>&kslongligbar;</am> k<ex>onung</ex>&slong; E7C8

6.5.7 The character “r” as interlinear ligature

A quite special type of abbreviation is interlinear “r” in ligature with e.g. “þ”. We suggest encoding this as a sporadic ligature of “þ” and interlinear “r”.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
þͬ þar <seg type="ligature">&thorn;&rsup;</seg> &thorn;<ex>ar</ex> 00FE + 036C

6.5.8 Sharp “s”

In late Old Norwegian, the “sharp s” appears in a number of abbreviations, e.g. for “skilling”, “smør” and “son”. The German character “sharp s” is defined in Unicode 5.0 as 00DF LATIN SMALL LETTER SHARP S in the range Latin-1 Supplement. We recommend using the ISO entity “&szlig;” also when this character is used as an abbreviation mark. The element <am> will indicate clearly that it is an abbreviation mark, not an ordinary character. See the discussion on the full stop in ch. 6.3.8 above.

Abbreviated form Expanded form Encoding of abbreviation Encoding of expansion Code point
Hakonß Hakonson Hakon<am>&szlig;</am> Hakon<ex>son</ex> 00DF

6.6 List of abbreviation marks

An extensive list of abbreviation characters is found in the MUFI character recommendation, cf. Appendix A below.