Problem areas
Manuscripts being what they are, it sometimes happens that parts of them can no longer be read at all or with the usual degree of confidence. The markup of such sections – whether they be long or short – is discussed here.
The relevant tags are <unclear>, <damage />, <gap />, <space /> and <supplied>.
Unclear text
Sometimes a letter, a word, a phrase or even a whole passage cannot be transcribed with certainty because we can't work out what the element or elements in question are. The <unclear> tag marks up such text.
"Unclear" text in the corpus may be the product of one of three processes:
- the ink may have faded
- someone may have retraced text
- someone may have tried to amend the original
These three processes are encoded in the reason attribute, with the values faint, retraced and correction respectively.
The following examples demonstrate the <unclear> tag in use:
<w><unclear reason="faint">a</unclear>ual</w>
<w><unclear reason="retraced">a</unclear>ual</w>
<w><unclear reason="correction">a</unclear>ual</w>
Damage
Probably the most frustrating experience of a transcriber is when he has to admit defeat: so great is the degree of damage to his text that nothing whatsoever can be read. One of the reasons for such damage is that a hole has developed in the manuscript. But not all holes are the product of wear, tear and rodents. Some were present when the scribe began his task.
To our mind, the difference between a naturally occurring hole and one which developed after the text
was written is important enough to merit separate tags: there may be loss of information if damage
occurred after the text was written but there is no information loss if the damage was pre-existing.
We have, however, complied with the TEI guidelines and encoded both possibilities with the catch-all <damage /> tag.
The types of damage which we have encountered fall into three groups. These are encoded with the type attribute:
- natural hole: typically found in a piece of skin through which passed one of the legs of its original owner; the text continues on the other side of the hole
- unnatural hole: due to a variety of causes, e.g. water, mice or other pests; some text may have been lost
- cropped: an unfortunate consequence of subjecting the manuscript to a binder's knife; some text will have been lost
In the transcriptions, therefore, complete <damage /> tags have one of the following three formats:
<damage type="natural hole" />
<damage type="unnatural hole" />
<damage type="cropped" />
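In context, a tag of this kind sits at the point in the text where the damage occurs. The following is a minimal sketch only: the two words are illustrative rather than quoted from a transcription, and it assumes a natural hole which the scribe wrote around, so that nothing has been lost:

```xml
<!-- hypothetical sequence: the hole falls between two complete words -->
<w>gwr</w><damage type="natural hole" /><w>doeth</w>
```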
The TEI guidelines allow for describing various other attributes of a damaged area, such as its size and who or what was responsible for it. Of particular relevance to our project was the size of a damaged area, which could be defined with an extent attribute. However, to have accurately measured these spaces would not only have been costly and time-consuming, but would also have distracted us from our primary task, which was to transcribe the texts. We therefore leave such matters to others.
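For anyone who does wish to record size, a tag carrying the extent attribute might look something like the sketch below. This form does not occur in our transcriptions, and the value shown is invented purely for illustration:

```xml
<!-- hypothetical: extent value is an invented estimate, not a measurement -->
<damage type="cropped" extent="2 letters" />
```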
Gaps
<gap /> indicates a point where we've been obliged to omit material in a transcription because the text is illegible.
As with <damage />, the TEI guidelines list several attributes which may be used in conjunction with <gap />. Again we decided that the most practicable course of action for the present project was to forgo the details so that we could devote our main energies to transcribing.
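A bare <gap /> might therefore appear as in the sketch below. The word is illustrative rather than quoted from a transcription, and the sketch assumes that a single letter at the start of the word cannot be read at all:

```xml
<!-- hypothetical: the first letter is illegible, so nothing is transcribed for it -->
<w><gap />ual</w>
```

Compare <unclear>, where a reading is still offered, albeit with less than the usual confidence.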
Spaces
Sometimes a scribe has left an atypical amount of space in a text. This happens most often where room was clearly left for a word or an ornate initial capital which was never added. The presence of such significant space is recorded by <space />.
Example:
<w><space />ri</w><w>gwr</w><w>doeth</w>
Editorially supplied text
<damage /> typically and <gap /> always encode parts of a manuscript which for some reason are no longer there or have become illegible. But the existence of damage does not mean that all is lost. Comparing a text with other related texts may enable us to be reasonably confident that we know what has been lost. Text enclosed in <supplied> tags is text restored in this way.
The reason attribute indicates why the text has had to be supplied:
- binding: the text has been swallowed up by the binding
- cropped: the text has been cut, probably by a binder's knife
- damage: the text has been damaged
- illegible: the text cannot be read
- sic: the text is missing because of scribal error
Two typical examples would be:
<w><supplied reason="illegible">y</supplied></w>
<w><supplied reason="binding">y</supplied></w>
Supplied text is typically the product of team consensus. We have not claimed such additions by putting our names to them. Sometimes, however, supplied text is the product of a previous researcher. The provenance of such additions is indicated by the resp attribute. In the Peniarth 46 manuscript, for example, the resp attribute records that the text was supplied by J. Gwenogvryn Evans:
<w><supplied reason="illegible" resp="Gwenogvryn">y</supplied></w>