pdfquery issues

Can't concat str to bytes

3

When loading a pdf via `pdf.load()`, specifically the Pathfinder 2E playtest pdf, this error is returned: ```Traceback (most recent call last): File "", line 1, in File "C:\Users\Graham Drakeley\.virtualenvs\Downloads-jPJuT-kW\lib\site-packages\pdfquery\pdfquery.py", line...

gtdrakeley

NEXTPY-569 -- Make pdfquery compatible with Python 3.9 and 3.11

Update documentation, formatting, some imports. That's it :-)

kdleijer

can't concat str to bytes EASY FIX -- please update!

3

Hi! We discovered that the can't concat str to bytes error on load() can be fixed : you must fix a bug on line 271 of pdfquery version 0.4.3 by...

jstofel

Support for password protected pdf files

Currently there's no way to read a password protected pdf file using `pdfquery`. Only `parser` arg is passed to `QPDFDocument` which later inits the `PDFDocument` with same (no `password` arg...

nvkex

recommend you use pdfminer rather than pdfquery

1

There is a bug in pdfquery ( see previous issue report). We switched to pdfminer and reduced processing time from 20 min to 2 min.

jstofel

Python 2 dependency problem: pyquery

`pyquery` has dropped Python 2 support as of v1.4.3, but `pdfquery` has no upper bound on this dependency for its Python 2 build. See https://pypi.org/project/pyquery/ and view release notes for...

cyranix

Improve performance on large pdfs

1

Replacing `all(...` with simple ands cuts time to 1/3 of what it took before. Found this after profiling on a pdf that required ~32m of these comparisons

OpenGLShaders

Is this project still alive?

3

On pypi, I see that the project is in `Development Status :: 4 - Beta`, but the last commit to this repository is more than half a year ago. Is...

MartinThoma

Coordinates to locator

Is there a way by which I can pass coordinates of a particular text and get locator as output. For eg: input in_bbox : [22,278, 67, 289] output: LTTextLineHorizontal:contains("address")') maybe...

prashantgpt91

Not able to detect horizontal lines properly.

The xml file generated correctly detects the line when there is padding in the line item. However, if there's no padding, the code is unable to detect any line. Can...

cabudies

pdfquery
pdfquery copied to clipboard

Metadata

Can't concat str to bytes

NEXTPY-569 -- Make pdfquery compatible with Python 3.9 and 3.11

can't concat str to bytes EASY FIX -- please update!

Support for password protected pdf files

recommend you use pdfminer rather than pdfquery

Python 2 dependency problem: pyquery

Improve performance on large pdfs

Is this project still alive?

Coordinates to locator

Not able to detect horizontal lines properly.

← Metadata

Owner

Metadata

pdfquery pdfquery copied to clipboard

Metadata

← Metadata

Owner

Metadata

pdfquery
pdfquery copied to clipboard