pdfparser Python 3 - problem when file name is str, not bytes

Open manish59 opened this issue 6 years ago • 6 comments

import pdfparser.poppler as pdf

pdf.Document(r"Manish.pdf")
Traceback (most recent call last):
  File "", line 1, in
  File "pdfparser/poppler.pyx", line 116, in pdfparser.poppler.Document.__cinit__
TypeError: expected bytes, str found

This is on Red Hat. Can anyone help me fix this, please?

Nov 01 '18 23:11 manish59

I'm assuming you're using Python 3, right? Please confirm the version. Currently the file name is a char*, which means bytes in Python 3, so you should use pdf.Document(b"Manish.pdf").

In future versions we should improve the interface to also accept strings in Python 3.
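Until the interface accepts str directly, one possible workaround is to encode the file name before passing it in. The sketch below is only an illustration: to_poppler_path and open_document are hypothetical helper names, not part of pdfparser, and it assumes pdf.Document takes a single bytes file name, as in the call above.

```python
import os


def to_poppler_path(name):
    """Return `name` as bytes, suitable for a char* file-name argument."""
    # os.fsencode accepts str, bytes, or os.PathLike and always returns bytes;
    # str input is encoded with the file system encoding.
    return os.fsencode(name)


def open_document(name):
    """Hypothetical wrapper: open a PDF given either a str or a bytes path."""
    # Imported lazily so this sketch can be loaded without pdfparser installed.
    import pdfparser.poppler as pdf

    # Assumption: Document accepts one bytes file name, as shown in this thread.
    return pdf.Document(to_poppler_path(name))
```

With this, open_document("Manish.pdf") and open_document(b"Manish.pdf") should both reach the library with a bytes file name.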

Nov 02 '18 06:11 izderadicka

Yes, I'm using Python 3. It's working. But the RGB values I'm getting look like r:0.89, g:0.42, b:0.04, and when I check, they don't show the color I need. Do I need to do anything here to get actual RGB values in the range 0-255?
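For the scaling part of the question, a minimal sketch follows. It assumes the components are floats in the range 0-1 and are already in an RGB color space; to_rgb255 is a hypothetical helper name, not a pdfparser function. If the values come from a different color space, scaling alone will not give the expected color.

```python
def to_rgb255(r, g, b):
    """Scale float RGB components in 0-1 to integers in 0-255."""

    def scale(c):
        # Clamp to 0-1 first so out-of-range input cannot leave 0-255.
        c = min(max(c, 0.0), 1.0)
        return int(round(c * 255))

    return scale(r), scale(g), scale(b)


print(to_rgb255(0.89, 0.42, 0.04))  # (227, 107, 10)
```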

Nov 02 '18 18:11 manish59

Actually, I figured out the color space. You can close this issue. Thanks for helping, though. Can we detect any hyperlinks using this tool?

Nov 02 '18 18:11 manish59

Can we use this library on Windows?

Nov 02 '18 19:11 manish59

See #17 - theoretically yes, but it requires advanced Windows, Python, and C++ skills.

I'm keeping this open as there is a potential improvement for Python 3.

Nov 02 '18 19:11 izderadicka

Can we detect hyperlinks using this tool?

Nov 02 '18 21:11 manish59
