Ch. 8. Fragmentation, illegibility and uncertainty
Version 3.0 beta
This is a preliminary version which can be changed or updated at any time.
The chapter has been written by Odd Einar Haugen.
Quite a few Medieval Nordic manuscripts have been preserved in their entirety. The recommendations in ch. 3 cover most aspects of the document structure in these manuscripts. However, many manuscripts, perhaps the majority of the earlier manuscripts, have been preserved in a less than complete state. This chapter will discuss how to deal with various degrees of fragmentation, ranging from minor holes in a leaf to tiny fragments of once large codices. It also deals with text that can be identified as such in the manuscript but whose reading is unclear.
In some of these cases, the encoder might want to supply the missing, illegible or
semilegible text, either from other manuscripts or by conjecture, using the
<supplied> element. This type of editorial intervention will be discussed
in the next chapter.
The focus in this chapter is on fragmentation which entails loss of text. In some manuscripts, there are holes or fissures which must have been original, such as in Fig. 8.1. Here, the text simply continues on the other side of the hole, so from a textual point of view, it is not necessary to encode this fact. The same applies to leaves which are not fully sized, e.g. by being narrower or shorter than other leaves in the manuscript.
When the manuscript contains a lacuna, we suggest the use of the element
<gap/>, whatever the extent of the lacuna. In some cases, the lacuna may
be nothing more than a small hole in a leaf of the manuscript, while in other cases,
the lacuna may consist of a whole leaf, a whole quire or even several quires.
This element has several optional attributes, specifying reason, responsibility and size:
|<gap/>||Is an element without extension in the encoded manuscript text. It indicates a point where material has been omitted in a transcription, either because it is physically missing in the manuscript or because the manuscript text is illegible. Attributes include:|
|@reason||Gives the reason for omission. Sample values include: 'damage', 'missing', 'removed'.|
|@resp||Indicates the transcriber, encoder or editor responsible for the decision not to provide any transcription and hence the application of the <gap/> element.|
|@quantity||Indicates approximately how much text has been omitted from the transcription. Values can be given as e.g. number of characters (‘chars.’), number of lines or number of leaves in the manuscript.|
|@unit||Names the unit used for describing the extent of the gap.|
|@agent||In the case of text omitted because of damage, categorizes the cause of the damage, if it can be identified.|
In Fig. 8.2 there are several small lacunas, and unlike the hole in Fig. 8.1, they are not original. A few words or parts of words in these lines have evidently been lost.
The encoding of these lines would be as follows (using a simplified encoding):
<pb n="10"/> . . . <w>mis <lb n="10"/> con<ex>n</ex>a[..]</w> <gap reason="damage" quantity="4" unit="chars"/> <w>[..]ar</w> <w>þa</w> <w>alla</w> <w>er</w> <w>a</w> <w>h<ex>ann</ex></w> <w>trva</w> <pc>.</pc> <w><ex>ok</ex></w> <w>mon</w> <w>þ<ex>at</ex></w> <w>mis <lb n="11"/> con<ex>n</ex>ar</w> <gap reason="damage" quantity="3" unit="chars"/> <w>[..]ior</w> <w>þei<ex>m</ex></w> <w>er</w> <w>endrb<ex>er</ex>asc</w> <w>af</w> <w>vat<ex>t</ex>ni</w> <w><ex>ok</ex></w> <gap reason="damage" quantity="2" unit="chars"/> <w>[..]l <lb n="12"/> gom</w> . . .
In this case, the missing characters can be reconstructed on the basis of other
manuscripts containing the text. Lost text can be inserted using the
<supplied> element described in ch. 9 below.
In ch. 8.5 below, examples of larger lacunas are given.
8.3 Empty space in the manuscript
The <space/> element is used to represent deliberate omissions from the
manuscript which have some significance, e.g. spaces left for decorated initials or
words. This element has several optional attributes, specifying the size and dimension of the space:
|<space/>||Is an element without extension in the encoded manuscript text. It indicates a point in a transcription of a manuscript where the manuscript has a deliberate omission. Attributes include:|
|@quantity||The extent of the space. Values can be given as e.g. number of characters, number of lines or number of leaves in the manuscript.|
|@unit||Names the unit used for describing the extent of the space.|
|@dim||Indicates the dimension of the space, i.e. whether it is horizontal or vertical. For irregular shapes in two dimensions, the value for this attribute should reflect the more important of the two dimensions. In conventional left-right scripts, a space with both vertical and horizontal components should be classed as vertical.|
In Fig. 8.3, the scribe has left space for the amount of silver to
be paid, so that it could be supplied at a later stage. Perhaps his exemplars were
conflicting here and he wanted to add the specific amount after having compared
sources. From other manuscripts of this text,
Landslǫg Magnúss Hákonarsonar (Magnus Lagabøtes landslov), we know that the
amount was half a mark (mǫrk f.).
The space left for the amount of silver in Fig. 8.3 can be encoded as follows:
<pb n="42v"/> . . . <lb n="18"/> . . . <w>en</w> <w>nu</w> <w>e<ex>r</ex></w> <w>skílt</w> <pc>.</pc> <w>giallde</w> <pc>.</pc> <space quantity="3" unit="chars" dim="horizontal"/> <w>S<ex>ilfrs</ex></w> <pc>.</pc> <w>Sv</w> <w>e<ex>r</ex></w> <w>on<ex>n</ex>ur</w> . . .
In another example, Fig. 8.4, the missing initial is evidently the character “Þ” in the word “Þa”.
The space has a size of approx. four characters horizontally and two lines vertically. According to the TEI P5 Guidelines, a space with both vertical and horizontal components should be classified as vertical. So, in this case, we measure the space along the vertical dimension and the natural measure is the number of lines. The missing initial in Fig. 8.4 can be encoded as follows:
<pb n="7r"/> . . . <lb n="28"/> <space quantity="2" unit="lines" dim="vertical"/> <w>[..]a</w> <w>m<ex>æ</ex>l<ex>t</ex>i</w> <w>G<ex>angleri</ex></w> <w>hu<ex>er</ex>r</w> <w>er</w> <w>leið</w> <w>t<ex>il</ex></w> <w>himins</w> <w>af</w> <w>iorðu</w> . . .
In both cases, and especially the second, the transcriber would add the
missing character(s) using the
<supplied> element. See ch. 9 for examples of this.
8.4 Unclear passages
Many manuscripts contain text that is difficult to read, sometimes downright illegible, whether it is a single character, a word, a phrase or even longer passages. We recommend using the
<unclear> element in such cases, always specifying the degree of unclearness with the attribute
@rend. For a completely illegible passage, we recommend the value 'illegible', and 'semilegible' for a passage which is partly legible. More than two degrees of legibility are possible, but they are probably not practicable. Obviously, the value 'legible'
does not make sense in a passage which is deemed to be
<unclear>. TEI does not offer attributes like
@degree for the
<unclear> element, so we have opted for the
@rend attribute here.
The reason for the unclearness may be described by the attribute
@reason and values such as 'faded', 'smudged' and 'erased'. The transcriber may also specify the one who is responsible for the reading in the
@resp attribute, especially if this is someone other than the transcriber (e.g. an earlier edition of the text). In cases where the physical reason for the unclearness can be identified, this may be set out in the @agent attribute.
8.4.1 Encoding of unclear passages
|<unclear>||Contains a letter, word, phrase or passage which cannot be transcribed with certainty. Attributes include:|
|@rend||Indicates how the passage was rendered in the source, specifically to which degree it is legible. We recommend using one of the two values 'illegible' and 'semilegible'.|
|@reason||Indicates why the material is hard to transcribe. Sample values include: 'faded', 'smudged' and 'erased'.|
|@resp||Indicates the individual responsible for the transcription of the letter, word, phrase or passage contained within the <unclear> element.|
|@agent||Where the difficulty in transcription arises from an identifiable cause, signifies the causative agent. Sample values include: 'rubbing', 'mildew' and 'smoke'.|
In Fig. 8.5, the 6th word in the 3rd line is difficult to read. It may be the verb veita ‘give’, but it should be stated by the transcriber that this is indeed an unclear reading.
The encoding of this line would be as follows (using a simplified encoding):
<pb n="2r"/> . . . <lb n="3"/> <w>postola</w> <w>sinum</w> <pc>.</pc> <w>Ðat</w> <w>ma</w> <w>han</w> <unclear rend="semilegible" reason="smudged"><w>veita</w></unclear> <w>mer</w>
The fissure in lines 4–5 must be original, so that e.g. the word “fiolda” in line 5 is written with “fiol” on the left-hand side and “da” on the right-hand side of the fissure. This is another example of original damage to the parchment discussed in ch. 8.1 above.
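The encoding of the unclear word in Fig. 8.5 uses only @rend and @reason. As a minimal sketch of how the optional @agent and @resp attributes might be added to the same word, assume that the smudging could be traced to rubbing and that the reading is taken over from an earlier editor. Both the cause and the pointer value #ED are hypothetical illustrations, not facts about this manuscript:

```xml
<!-- Hypothetical values: agent="rubbing" and resp="#ED" are illustrations only -->
<unclear rend="semilegible" reason="smudged" agent="rubbing" resp="#ED"><w>veita</w></unclear>
```

Here #ED is assumed to point to an identifier declared elsewhere in the file, e.g. in the header.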
In Fig. 8.6, the first character in the second line is so smudged that it has become illegible. From the context, one can assume that it is the Tironian nota for “ok”, and that it thus is a whole word which is illegible. It is practical to render the illegible character by a standard sign, and we suggest using U+25CC DOTTED CIRCLE for this purpose. If it is just a handful of illegible characters, we recommend using as many dotted circles as there probably were characters in the manuscript. For longer passages, we suggest an opening series of three dotted circles, a number of full stops, and a closing series of dotted circles, e.g. ◌◌◌...◌◌◌.
<pb n="51v"/> . . . <lb n="32"/> <unclear rend="illegible" reason="smudged"><w>◌</w></unclear> <w>m<ex>æ</ex>la</w> <w>s<ex>va</ex></w> <pc>.</pc> <w>lios</w> <w>þ<ex>ett</ex>a</w> <w>mon</w> <w>scina</w>
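For a longer illegible passage, the placeholder convention suggested above might be encoded along the following lines. This is a sketch only: the line number and the cause given in @reason are hypothetical, and whether the placeholder should be wrapped in one or more <w> elements is left open here.

```xml
<!-- Hypothetical passage: line number and reason are illustrative only -->
<lb n="5"/> <unclear rend="illegible" reason="faded">◌◌◌...◌◌◌</unclear>
```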
See ch. 9 for an example of text supplied by the editor using the <supplied> element.
8.4.2 Display of unclear passages
In general, the Menota stylesheet uses grey characters for text that is not in the actual manuscript, such as an explanatory note by the editor. By analogy, grey is also used for semilegible and illegible characters. In keeping with traditional printed editions, which typically use subpunction for each unclear character, the stylesheet adds a dotted underline for semilegible characters, in addition to the grey colouring. Illegible characters, which should be encoded using the dotted circle (U+25CC), will be displayed in the same degree of grey as semilegible characters.
Stylesheets differ somewhat with respect to the display of unclear passages. This is the present display in the Menota archive:
|Elements and attributes||Display in a single-level transcription||Display in a multi-level transcription|
|<unclear rend="semilegible">||Characters are rendered in grey with a dotted, grey underline.||Characters are rendered in grey with a dotted, grey underline on the
|<unclear rend="illegible">||The dotted circle is rendered in grey.||The dotted circle is rendered in grey on the
8.5 Document structure of fragments
A fragment can be defined as anything less than 50 % of a once complete manuscript, but the term is commonly used for much shorter parts of a manuscript, perhaps only a few leaves, a single leaf, or even a small bit of a leaf. Fragments can be quite challenging with respect to the encoding of their document structure, even if the text as such is perfectly readable. Below, we will be looking at four examples, focusing on how to deal with their document structure and how to refer to the missing text which surrounds them.
8.5.1 Several leaves
NRA 7 is the largest fragment of
Landslǫg Magnúss Hákonarsonar
(Magnus Lagabøtes landslov), containing seven leaves from
various parts of the law. One of the leaves is in fact preserved as three
individual pieces. The encoding of this leaf will be discussed in ch. 8.5.4.
The sequence of the leaves making up NRA 7 can be ascertained with a full degree of certainty, since this law text is preserved in around 40 manuscripts. They should therefore be foliated in the sequence they have in the text, from fol. 1 to fol. 7. Of these leaves, fols. 3–4 and 6–7 are consecutive, but between the remaining leaves, there are a number of lost leaves.
When the loss of leaves is as great as in this case, it may be difficult to
specify the number of missing leaves between each preserved one. It might,
however, be possible to establish the sequence using a series of
<gap/> elements, showing where there is continuity and where there is not:
<gap quantity="several leaves from the beginning of the manuscript"/> <pb n="1r"/> . . . <pb n="1v"/> . . . <gap quantity="several leaves"/> <pb n="2r"/> . . . <pb n="2v"/> . . . <gap quantity="several leaves"/> <pb n="3r"/> . . . <pb n="3v"/> . . . <pb n="4r"/> . . . <pb n="4v"/> . . . <gap quantity="several leaves"/> <pb n="5r"/> . . . <pb n="5v"/> . . . <gap quantity="several leaves"/> <pb n="6r"/> . . . <pb n="6v"/> . . . <pb n="7r"/> . . . <pb n="7v"/> . . . <gap quantity="several leaves until end of manuscript"/>
From this encoding, we understand that the beginning and the end of the once complete codex are missing, and that there are gaps in between all leaves apart from between fols. 3 and 4, and between fols. 6 and 7.
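Note that in the TEI P5 attribute class for dimensions, @quantity is defined as a numeric value, while free-text descriptions of size belong in @extent. Assuming the schema in use follows TEI P5 on this point (this has not been checked against the Menota schema), the first gaps in the sequence above could alternatively be written as in the following sketch:

```xml
<!-- Assumption: @extent is available as in TEI P5; reason="missing" is one of the sample values listed earlier -->
<gap reason="missing" extent="several leaves from the beginning of the manuscript"/>
<pb n="1r"/> . . . <pb n="1v"/> . . .
<gap reason="missing" extent="several leaves"/>
<pb n="2r"/> . . .
```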
8.5.2 A pair of leaves
Quires were made up of pairs of leaves, bifolia, and it is not uncommon that one or more bifolia are missing from the quire, often the inner or outer bifolium. Unless the bifolium belongs to the inner part of a quire, the text will not be consecutive.
This type of fragmentation can be illustrated with quire IX of AM 619 4to. According to the standard foliation of the manuscript, there is an outer bifolium missing between the present folios 62 and 63, and 68 and 69, as shown in Fig. 8.7. There will thus be a lacuna of two leaves or four pages in the text. Note that the foliation of the manuscript is consecutive, so there is nothing in these numbers that indicates that there is a gap between fols. 62 and 63, and between fols. 68 and 69.
The missing leaves are not immediately visible in the codex, which has been bound at a later stage. However, looking at the bottom of fol. 62, one finds a statement by a younger hand that a leaf is missing. By reading fols. 62 and 63, it is also clear that a piece of text is missing, but not immediately how much.
As recommended above, the missing text should be indicated by the
<gap/> element, so that the first lacuna, between fols. 62 and 63, will be encoded as follows:
<w>bøt</w> <w>með</w> <w>scripta</w> <gap quantity="1" unit="leaf"/> <pb n="63r"/> <lb n="1"/> <w>er</w> <w>hinn</w> <w>helgi</w>
While this encoding is sufficient, we strongly recommend that the transcriber add an explanatory note, so that users will have some guidance as to what is missing and to what extent it can be supplied from e.g. other manuscripts. This note should be concise and use non-technical language, for example:
<w>bøt</w> <w>með</w> <w>scripta</w> <gap quantity="1" unit="leaf"/> <note type="explanatory">A leaf is missing between fols. 62v and 63r. The text can be supplied from another manuscript, Upps DG 8 II, fol. 106.18–108.20.</note> <pb n="63r"/> <lb n="1"/> <w>er</w> <w>hinn</w> <w>helgi</w>
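The second lacuna, between fols. 68 and 69, can be handled in the same way. The following is a sketch only: the surrounding words are elided since they are not quoted in this chapter, and the wording of the note is merely a suggestion.

```xml
<!-- Sketch: surrounding text elided; note wording is a suggestion only -->
. . . <gap quantity="1" unit="leaf"/>
<note type="explanatory">A leaf is missing between fols. 68v and 69r.</note>
<pb n="69r"/> <lb n="1"/> . . .
```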
8.5.3 A single leaf
There are several early Norwegian fragments of
Speculum regale. One of these is NKS 235 g 4to, a single
fragment in two columns, each of 27 lines, as illustrated in Fig. 8.9.
The document structure of NKS 235 g 4to is simple. Since it is the only leaf preserved from what was presumably once a complete codex, there is no alternative but to call it fol. 1. On either side of this leaf, there will be a considerable lacuna, the first from the beginning of the codex, the second until the end (unless the manuscript was never completed):
<gap quantity="many leaves from the beginning of the manuscript"/> <pb n="1r"/> <cb n="A"/> <lb n="1"/> . . . <cb n="B"/> <lb n="1"/> . . . <lb n="27"/> <pb n="1v"/> <cb n="A"/> <lb n="1"/> . . . <cb n="B"/> <lb n="1"/> . . . <lb n="27"/> <gap quantity="many leaves until end of manuscript"/>
While the initial and final
<gap/> encoding is correct enough, it is not
strictly necessary. Here, again, an explanatory note would be more useful (see ch. 8.5.2 above).
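Such a note might look as follows. This is a sketch under the assumption that the note is placed before the first page break; the wording is a suggestion and should be adapted to what is known about the fragment.

```xml
<!-- Suggested wording only; placement before the first <pb/> is an assumption -->
<note type="explanatory">This leaf is the only one preserved of the codex. The text before and after it is lost.</note>
<pb n="1r"/> <cb n="A"/> <lb n="1"/> . . .
```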
8.5.4 One or more pieces of a leaf
The smallest fragments are small strips or cuttings of leaves. One such fragment may be all that is left of a complete codex. In some cases, more than one fragment can be pieced together as part of a single leaf, and sometimes several fragmented leaves can be brought together, such as in the case of NRA 7 above. Here, the three fragments shown in Fig. 8.10 can be shown to belong to a single leaf, so that when putting them together, a substantial part of the leaf can be read.
As for the document structure, we suggest that all three fragments are encoded as the same leaf, in this case fols. 2r and 2v, and that they receive line numbers to the extent that this can be ascertained. Since several other leaves of this once complete manuscript have been preserved, one can be fairly certain about the line numbering, as well as the position of the pieces in each of the two columns.
Taking the single piece from column A on fol. 2v as an example, the document structure would thus be:
<pb n="2v"/> <cb n="A"/> <lb n="11"/>NU ma maðr bøta <lb n="12"/>rað sunar sínns. oc læiða . . . <lb n="18"/>tækr ætt leíðíngr þo eígí merí arf <lb n="19"/>en sa stoð till er arfi iattaði. Sa skal
The whole leaf 2v would then be encoded with the
<gap/> element as follows:
<pb n="2v"/> <cb n="A"/> <gap quantity="10" unit="lines"/> <lb n="11"/>NU ma maðr bøta . . . <lb n="19"/>en sa stoð till er arfi iattaði. Sa skal <gap quantity="8" unit="lines"/> <cb n="B1"/> <lb n="1"/>Sua skall kono ættleiða sem karll- . . . <lb n="8"/>þat fe allt hafa sem hann er till <cb n="B2"/> <lb n="9"/>læídðr meðan þæír lifa er ættleíðu . . . <lb n="19"/>fyrír í þæírí villu. at hann <gap quantity="8" unit="lines"/>