pdftoxml in utils.py is not portable to Windows.

Open StevenMaude opened this issue 11 years ago • 3 comments

May 16 '14 09:05 StevenMaude

Is this still true?

I am trying to convert a PDF to XML on a Windows machine with Python 3, and I am getting an error on the line "return xmldata.decode('utf-8')".

Please let me know.

Mar 16 '17 15:03 aparna06

This is still the case as no-one's changed the code there.

You could:

Run pdftohtml.exe separately and dump the results to a file (via a script, via Python's subprocess module, or however you like).
Or try using this as a starting point for replacing the code in this package.
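
For the first option, a minimal sketch of calling pdftohtml through subprocess might look like the following. It is not the package's code. It assumes pdftohtml (or pdftohtml.exe) is on your PATH, that your build accepts the -xml, -noframes and -enc options, and that with -xml it writes its output to the given base name plus ".xml". The names build_pdftohtml_command and pdf_to_xml, and the run parameter, are made up for illustration.

```python
import os
import subprocess
import tempfile


def build_pdftohtml_command(pdf_path, out_base, exe="pdftohtml"):
    """Build the argument list; passing a list avoids shell quoting issues."""
    return [exe, "-xml", "-noframes", "-enc", "UTF-8", pdf_path, out_base]


def pdf_to_xml(pdf_bytes, exe="pdftohtml", run=subprocess.run):
    """Convert PDF bytes to an XML string via an external pdftohtml binary.

    `run` is a hypothetical hook so the subprocess call can be swapped out.
    """
    # A temporary directory with ordinary files avoids reopening a
    # still-open NamedTemporaryFile, which is restricted on Windows.
    with tempfile.TemporaryDirectory() as workdir:
        pdf_path = os.path.join(workdir, "input.pdf")
        out_base = os.path.join(workdir, "output")
        with open(pdf_path, "wb") as pdf_file:
            pdf_file.write(pdf_bytes)

        # subprocess.DEVNULL replaces the Unix-only ">/dev/null" redirect.
        run(
            build_pdftohtml_command(pdf_path, out_base, exe),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,
        )

        # Assumption: the tool wrote "<out_base>.xml" encoded as UTF-8.
        with open(out_base + ".xml", encoding="utf-8") as xml_file:
            return xml_file.read()
```

Usage would be something like pdf_to_xml(open("document.pdf", "rb").read()), where document.pdf is a placeholder file name. Opening the output in text mode returns a str directly, so no separate .decode('utf-8') call is needed.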

Mar 16 '17 16:03 StevenMaude

Thank you. I shall try both.

Mar 16 '17 16:03 aparna06