How to read figure caption from DOCX

Open jgm opened this issue 1 year ago • 0 comments

Discussed in https://github.com/jgm/pandoc/discussions/9390

Originally posted by rgaiacs January 30, 2024

Pandoc supports Microsoft Word's native table captions. For example, when mwe-table.docx

Screenshot 2024-01-30 145920

is converted using pandoc --from docx --to html mwe-table.docx, it produces

<p>Lorem ipsum.</p>
<table>
<caption><p>Awesome table</p></caption>
<colgroup>
<col style="width: 50%" />
<col style="width: 50%" />
</colgroup>
<thead>
<tr class="header">
<th>A</th>
<th>B</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>1</td>
<td>2</td>
</tr>
</tbody>
</table>

Does Pandoc support reading Microsoft Word's native figure captions? For example, when mwe-img.docx

Screenshot 2024-01-30 155653

is converted using pandoc --from docx --to html mwe-img.docx, it produces

<p>Lorem ipsum.</p>
<p><img src="media/image1.png"
style="width:1.10263in;height:1.10263in" /></p>
<p>Figure Blue square.</p>

instead of

<p>Lorem ipsum.</p>
<figure>
   <img src="media/image1.png"/>
   <figcaption>
      Blue square.
   </figcaption>
</figure>
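Given the output shown above, one possible workaround is to post-process the HTML that pandoc writes. The following is only a minimal sketch, not part of pandoc: the function name `wrap_figures` is hypothetical, and it assumes the caption is always a paragraph that starts with the literal word "Figure" and directly follows a paragraph containing only an `<img>` tag, as in the mwe-img.docx output.

```python
import re

# Assumption: an image-only paragraph immediately followed by a paragraph
# beginning with "Figure", exactly as in the HTML output shown above.
_FIGURE_PATTERN = re.compile(
    r"<p>\s*(<img\b[^>]*/>)\s*</p>\s*<p>Figure\s+(.*?)</p>",
    re.DOTALL,
)


def wrap_figures(html: str) -> str:
    """Rewrite matching image/caption paragraph pairs as <figure> elements."""

    def _replace(match: re.Match) -> str:
        img = match.group(1)
        caption = match.group(2).strip()
        return (
            "<figure>\n"
            f"{img}\n"
            f"<figcaption>{caption}</figcaption>\n"
            "</figure>"
        )

    # Text that does not match the pattern is returned unchanged.
    return _FIGURE_PATTERN.sub(_replace, html)


if __name__ == "__main__":
    sample = (
        "<p>Lorem ipsum.</p>\n"
        '<p><img src="media/image1.png"\n'
        'style="width:1.10263in;height:1.10263in" /></p>\n'
        "<p>Figure Blue square.</p>\n"
    )
    print(wrap_figures(sample))
```

A regular expression is enough for this small, predictable fragment, but it would likely misfire on documents where an ordinary paragraph beginning with "Figure" happens to follow an image; a real HTML parser would be a safer choice there.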

jgm, Jan 30 '24 17:01