
Docx Reader: Support for aligned paragraphs

Open frederik opened this issue 1 year ago • 4 comments

The Microsoft Word feature to align text Left, Right and Both (justified) is a frequently used feature in the manuscripts I encounter. Currently, this information is not transferred to the Pandoc AST. I would appreciate it if we could encode this information somehow. The property is encoded in the paragraph properties like this:

<w:pPr>
    <w:jc w:val="both" />
    <w:rPr>
        <w:lang w:val="en-US" />
    </w:rPr>
</w:pPr>
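
For anyone who needs this value outside of Pandoc in the meantime, here is a minimal sketch of reading it from a fragment like the one above with Python's standard library. The function name `paragraph_alignment` is made up for illustration, and the `xmlns:w` declaration is added by hand because the fragment above omits it; in a real `word/document.xml` it is declared on the root element.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# WordprocessingML namespace bound to the "w:" prefix in DOCX files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_alignment(xml_fragment: str) -> Optional[str]:
    """Return the w:val of the first w:jc element, or None if there is none.

    Hypothetical helper for illustration; it is not part of Pandoc.
    """
    root = ET.fromstring(xml_fragment)
    # iter() visits the root and all of its descendants.
    jc = next(root.iter(f"{{{W_NS}}}jc"), None)
    if jc is None:
        return None
    return jc.get(f"{{{W_NS}}}val")


# The fragment from this issue, with the namespace declared (an assumption
# needed to make the snippet parse on its own).
SAMPLE = f"""<w:pPr xmlns:w="{W_NS}">
    <w:jc w:val="both" />
    <w:rPr>
        <w:lang w:val="en-US" />
    </w:rPr>
</w:pPr>"""

print(paragraph_alignment(SAMPLE))
```

This only looks at direct paragraph formatting; alignment inherited from a named style would have to be resolved through `word/styles.xml`, which the sketch does not attempt.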

aligns.docx

frederik avatar Feb 08 '24 18:02 frederik

This is the sort of presentational information that we normally don't preserve.

jgm avatar Feb 08 '24 18:02 jgm

Is this something you don't see in Pandoc in general, or something that would have to be added through extensions by whoever needs it?

A little more background on where this is coming from: a considerable number of the journals and monographs using our os-aps.de [MIT] suite (which makes use of Pandoc for DOCX import and some exports) have expressed a desire to keep this information all the way down to the JATS XML. Right now they are changing this by hand after the import, before producing the XML. Pandoc has been great for this so far!

frederik avatar Feb 08 '24 20:02 frederik

Could the +styles docx extension be used for this purpose?
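
A rough sketch of what that could look like, assuming the alignment is carried by a named paragraph style and not by direct formatting: with `docx+styles`, paragraphs that use a custom style are wrapped in a `Div` with a `custom-style` attribute, so a small script can collect those names from Pandoc's JSON output. The function name `custom_styles` and the style name `Centered` are invented for the example, and the sample AST is written by hand, not produced from `aligns.docx`.

```python
import json


def custom_styles(doc: dict) -> list:
    """Collect the custom-style values of all Divs in a Pandoc JSON AST.

    Hypothetical helper for illustration; it is not part of Pandoc.
    """
    styles = []

    def walk(node):
        if isinstance(node, dict):
            if node.get("t") == "Div":
                # A Div's content is [[id, classes, key-value pairs], blocks].
                (_ident, _classes, attrs), _blocks = node["c"]
                for key, value in attrs:
                    if key == "custom-style":
                        styles.append(value)
            for child in node.values():
                walk(child)
        elif isinstance(node, list):
            for child in node:
                walk(child)

    walk(doc["blocks"])
    return styles


# Hand-written AST in the shape Pandoc emits for a styled paragraph;
# the style name "Centered" is an assumption.
SAMPLE_DOC = json.loads("""
{
  "pandoc-api-version": [1, 23],
  "meta": {},
  "blocks": [
    {"t": "Div",
     "c": [["", [], [["custom-style", "Centered"]]],
           [{"t": "Para", "c": [{"t": "Str", "c": "Hello"}]}]]},
    {"t": "Para", "c": [{"t": "Str", "c": "Plain"}]}
  ]
}
""")

print(custom_styles(SAMPLE_DOC))
```

To try it on a real file, the input would come from something like `pandoc -f docx+styles -t json aligns.docx`. As far as I understand, a bare `<w:jc>` set directly on a paragraph would not show up this way, since the extension records style names only.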

tarleb avatar Apr 25 '24 17:04 tarleb

I think this is probably just out of scope. Paragraph justification is a presentational detail.

jgm avatar Apr 27 '24 02:04 jgm