pdf2docx
"All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters" error still exists
Description of the bug
We have a PDF file that contains some invalid characters, such as \uffff. The conversion fails with an error when it reaches them.
Please help.
How to reproduce the bug
Convert a PDF document whose text contains the character \uffff.
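To make the problem concrete, here is a minimal sketch of which characters are affected and how the text could be cleaned before it is written to XML. It assumes the error is raised for any character outside the XML 1.0 `Char` range, which excludes \uffff, NULL bytes and most control characters. The helper name `strip_xml_incompatible` is hypothetical and is not part of pdf2docx.

```python
import re

# Matches every character outside the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML_INCOMPATIBLE = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_xml_incompatible(text: str) -> str:
    """Hypothetical helper (not a pdf2docx API): drop characters XML 1.0 disallows."""
    return _XML_INCOMPATIBLE.sub("", text)


# Text as it might be extracted from the PDF, with \uffff and a NULL byte in it.
sample = "page\uffff text\x00 end"
cleaned = strip_xml_incompatible(sample)
print(repr(cleaned))
```

Applying something like this to the extracted text before it reaches the docx writer might avoid the error, at the cost of silently dropping the offending characters.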
pdf2docx version
0.5.8
Operating system
Windows
Python version
3.11