Pandoc produces 0 length PDF from docx
Explain the problem.
pandoc fails to convert a docx file to PDF. It outputs an empty PDF file from this particular docx. File attached:
new_resume_001.docx
Pandoc version? What version of pandoc are you using, on what OS? (If it's not the latest release, please try with the latest release before reporting the issue.)
OS: MacOS 13.5 (22G74)
❯ pandoc --version
pandoc 3.1.6
Features: +server +lua
Scripting engine: Lua 5.4
I have tried with both BasicTeX and MacTeX (15 March 2023, 5.51 GB). Same result with both.
❯ pdflatex --version
pdfTeX 3.141592653-2.6-1.40.25 (TeX Live 2023)
kpathsea version 6.3.5
Copyright 2023 Han The Thanh (pdfTeX) et al.
There is NO warranty. Redistribution of this software is
covered by the terms of both the pdfTeX copyright and
the Lesser GNU General Public License.
For more information about these matters, see the file
named COPYING and the pdfTeX source.
Primary author of pdfTeX: Han The Thanh (pdfTeX) et al.
Compiled with libpng 1.6.39; using libpng 1.6.39
Compiled with zlib 1.2.13; using zlib 1.2.13
Compiled with xpdf version 4.04
I can confirm that pandoc parses this docx into an empty document. Will need to examine it more closely to see why.
I suspect this is actually just another manifestation of #3086.
In your docx, the textual content comes under <v:textbox> elements.
They are also under <mc:AlternateContent> / <mc:Fallback>, so #5394 may also be relevant.
Sketch of the xml structure:
<w:p>
  <w:r>
    <mc:AlternateContent>
      <mc:Choice Requires="wps">
        <w:drawing>
      <mc:Fallback>
        <w:pict>
          ...
          <v:textbox>
            <w:txbxContent>
              <w:p>
                ...textual content here...
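
To check whether another docx is affected in the same way, the structure sketched above can be looked for directly in the file. The following is only a rough sketch, not part of pandoc: the function name textbox_paragraphs is made up for illustration, and it assumes the main document part is stored at word/document.xml and that the text sits in <w:t> runs under <v:textbox> / <w:txbxContent>, as in the sketch.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs for WordprocessingML (w:) and VML (v:).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
V_NS = "urn:schemas-microsoft-com:vml"


def textbox_paragraphs(docx_source):
    """Return the text of every non-empty paragraph found inside a
    <v:textbox>/<w:txbxContent> element of a docx file.

    docx_source may be a path or a binary file-like object.
    Hypothetical helper for diagnosis only; nested text boxes would
    be reported more than once.
    """
    with zipfile.ZipFile(docx_source) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    paragraphs = []
    for textbox in root.iter(f"{{{V_NS}}}textbox"):
        for content in textbox.iter(f"{{{W_NS}}}txbxContent"):
            for para in content.iter(f"{{{W_NS}}}p"):
                # Concatenate all text runs of this paragraph.
                text = "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
                if text:
                    paragraphs.append(text)
    return paragraphs
```

Calling textbox_paragraphs("new_resume_001.docx") would list the paragraphs that live only inside text boxes. If that list holds essentially all of the visible text, the document is likely to come out empty for the reason described above.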