page2tei
Tagging image metadata inside a facsimile element
Image metadata is currently tagged within the <sourceDoc> element with <graphic>:
<sourceDoc>
  <graphic url="FRAN_0025_3056_L-0.jpg" width="2894px" height="4393px"/>
  <surfaceGrp>
    <surface xml:id="eSc_textblock_afbab800"
             type="structure_{type:col_1;}"
             points="421,615 421,2236 465,2211 465,2266 421,2269 425,2449 410,4148 362,4213 205,4228 234,615">
      <zone xml:id="eSc_line_86b00a8e"
            type="mask"
            points="285,838 293,812 322,798 380,801 377,863 289,874">
        <path type="baseline" points="289,841 389,845"/>
        <line>198</line>
      </zone>
      ...
Instead, and for the sake of clarity, image metadata can be tagged inside the <facsimile> element:
<facsimile>
  <graphic url="FRAN_0025_3056_L-0.jpg" width="2894px" height="4393px" xml:id="FRAN_0025_3056_L-0"/>
</facsimile>
<sourceDoc>
  <surfaceGrp facs="#FRAN_0025_3056_L-0">
    <surface xml:id="eSc_textblock_afbab800"
             type="structure_{type:col_1;}"
             points="421,615 421,2236 465,2211 465,2266 421,2269 425,2449 410,4148 362,4213 205,4228 234,615">
      <zone xml:id="eSc_line_86b00a8e"
            type="mask"
            points="285,838 293,812 322,798 380,801 377,863 289,874">
        <path type="baseline" points="289,841 389,845"/>
        <line>198</line>
      </zone>
      ...
Image metadata and transcription data would then be separated into their respective elements. With appropriate xml:id and facs attributes, multiple images could be encoded in a single TEI file.
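
As a rough illustration of the multi-image case, the fragment below sketches how two page images might be declared and linked. It is only a sketch: the second file name, its xml:id and its dimensions are hypothetical, the contents of each <surfaceGrp> are left out, and both top-level elements are assumed to sit inside the same <TEI> root.

```xml
<facsimile>
  <!-- First image, taken from the example above -->
  <graphic xml:id="FRAN_0025_3056_L-0" url="FRAN_0025_3056_L-0.jpg" width="2894px" height="4393px"/>
  <!-- Second image: hypothetical file name, identifier and dimensions -->
  <graphic xml:id="FRAN_0025_3056_L-1" url="FRAN_0025_3056_L-1.jpg" width="2894px" height="4393px"/>
</facsimile>
<sourceDoc>
  <!-- Each surfaceGrp points back to its image through facs -->
  <surfaceGrp facs="#FRAN_0025_3056_L-0">
    <!-- surfaces and zones transcribed from the first image -->
  </surfaceGrp>
  <surfaceGrp facs="#FRAN_0025_3056_L-1">
    <!-- surfaces and zones transcribed from the second image -->
  </surfaceGrp>
</sourceDoc>
```

Each facs value is a local pointer ("#" followed by the xml:id of a <graphic>), so every identifier would need to be unique within the file and a valid XML name.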