page2tei

Tagging image metadata inside a facsimile element

Open: HugoSchtr opened this issue 3 years ago • 1 comment

Image metadata is currently encoded in a <graphic> element placed directly inside <sourceDoc>:

<sourceDoc>
      <graphic url="FRAN_0025_3056_L-0.jpg" width="2894px" height="4393px"/>
      <surfaceGrp>
         <surface xml:id="eSc_textblock_afbab800"
                  type="structure_{type:col_1;}"
                  points="421,615 421,2236 465,2211 465,2266 421,2269 425,2449 410,4148 362,4213 205,4228 234,615">
            <zone xml:id="eSc_line_86b00a8e"
                  type="mask"
                  points="285,838 293,812 322,798 380,801 377,863 289,874">
               <path type="baseline" points="289,841 389,845"/>
               <line>198</line>
            </zone>
            ...

For clarity, the <graphic> could instead go inside a <facsimile> element, with the <surfaceGrp> pointing back to it through a facs attribute:

<facsimile>
      <graphic url="FRAN_0025_3056_L-0.jpg" width="2894px" height="4393px" xml:id="FRAN_0025_3056_L-0"/>
</facsimile>
<sourceDoc>
      <surfaceGrp facs="#FRAN_0025_3056_L-0">
         <surface xml:id="eSc_textblock_afbab800"
                  type="structure_{type:col_1;}"
                  points="421,615 421,2236 465,2211 465,2266 421,2269 425,2449 410,4148 362,4213 205,4228 234,615">
            <zone xml:id="eSc_line_86b00a8e"
                  type="mask"
                  points="285,838 293,812 322,798 380,801 377,863 289,874">
               <path type="baseline" points="289,841 389,845"/>
               <line>198</line>
            </zone>
            ...

Image metadata and transcription data would then live in separate elements. With matching xml:id and facs attributes, a single TEI file could also encode multiple images.
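
To illustrate the multi-image case, here is a minimal sketch, assuming one <surfaceGrp> per image. The second <graphic> (its file name, dimensions, and xml:id) is a hypothetical placeholder and not taken from real data; surfaces and zones are left out for brevity.

```xml
<!-- Sketch only: "example_page_2" and its attributes are hypothetical -->
<facsimile>
      <!-- One <graphic> per image, each with its own xml:id -->
      <graphic url="FRAN_0025_3056_L-0.jpg" width="2894px" height="4393px" xml:id="FRAN_0025_3056_L-0"/>
      <graphic url="example_page_2.jpg" width="2894px" height="4393px" xml:id="example_page_2"/>
</facsimile>
<sourceDoc>
      <!-- Each <surfaceGrp> points to its image through facs -->
      <surfaceGrp facs="#FRAN_0025_3056_L-0">
         <!-- surfaces and zones transcribed from the first image -->
      </surfaceGrp>
      <surfaceGrp facs="#example_page_2">
         <!-- surfaces and zones transcribed from the second image -->
      </surfaceGrp>
</sourceDoc>
```

Each facs value is the xml:id of the corresponding <graphic> prefixed with "#", so a consumer of the file can resolve which image a given block of transcription belongs to.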

HugoSchtr, Nov 09 '21 08:11