python-pdftables-api
API not working for files over 100KB
Greetings,
I am submitting a large set of files, and only files under 100KB are getting processed. All the others are skipped without raising an error or printing any error message. I have adjusted the timeout parameter, but this does not fix the issue.
Thanks!
@jmbanda: sorry for the delayed reply, you caught us over the holiday period.
Is this still an issue? If so, is it possible to provide us with the example code and PDFs to try and reproduce the error?
Greetings, yes, this continues to be an issue. I can't provide the PDF as it is private, but any PDF above 100KB was failing with the following code:
import pdftables_api
c = pdftables_api.Client('my-api-key', timeout=(60, 3600))
c.xlsx('input.pdf', 'output.xlsx')
The same happens with or without the timeout parameter. We still have plenty of pages left in our paid bundle, so that is not the issue. No error is thrown; the documents are simply skipped. If we upload a document manually through the web UI, it converts fine.
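One way to narrow down a silent skip like this is to wrap each conversion so that the file size, any exception, and the presence of the output file are all recorded. The following is only a minimal sketch: convert_batch is a hypothetical helper, not part of pdftables_api, and it takes the conversion function as an argument so it makes no assumptions about the library beyond the c.xlsx(input, output) call shown above.

```python
from pathlib import Path


def convert_batch(pdf_paths, convert, out_dir="."):
    """Run convert(src, dst) for each PDF and record what happened.

    `convert` is any callable taking (input_path, output_path); the bound
    method c.xlsx from the snippet above is the intended candidate.
    Returns a list of (file name, size in KB or None, status) tuples.
    """
    results = []
    for src in map(Path, pdf_paths):
        dst = Path(out_dir) / (src.stem + ".xlsx")
        size_kb = None
        try:
            size_kb = round(src.stat().st_size / 1024, 1)
            convert(str(src), str(dst))
            # A call that returns normally but leaves no file is a silent skip.
            # Note: an output file left over from an earlier run also counts
            # as "ok" here, so start from an empty output directory.
            if dst.exists() and dst.stat().st_size > 0:
                status = "ok"
            else:
                status = "no output written"
        except Exception as exc:
            # Keep going so one bad file does not hide the rest of the batch.
            status = f"{type(exc).__name__}: {exc}"
        results.append((src.name, size_kb, status))
    return results
```

With the client from the snippet above, this would be called as convert_batch(paths, c.xlsx), and printing each returned tuple shows whether the larger files raise an exception that was being swallowed elsewhere or return normally without producing a file.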
Thanks; we'll add it to our issue queue and take a look, then report back (it may be a few days).
Just to follow up, I've tested the code here on a fresh Ubuntu 22.04 virtual machine and can't reproduce the issue. This was using Python 3.10 that came bundled with the operating system.
I did the following:
- Created a virtualenv with python3 -m venv api
- Activated the virtualenv with source api/bin/activate
- Ran pip install git+https://github.com/pdftables/python-pdftables-api.git to install the API code.
- Converted a test PDF named input.pdf of size 360 KB with the following code (edited to include my actual API key):
import pdftables_api
c = pdftables_api.Client('my-api-key', timeout=(60, 3600))
c.xlsx('input.pdf', 'output.xlsx')
This produced an output Excel file named output.xlsx.
If you can give any more details about the environment in which the code was failing, we can try and reproduce further. It's tricky to fix without encountering the problem, unfortunately.
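For anyone gathering those environment details, a short script along these lines could collect the basics in one go. It is a sketch under assumptions: the distribution names "pdftables-api" and "requests" are guesses at what pip lists for this install, and environment_report is a hypothetical helper rather than anything shipped with the library. Packages that are not found are reported as such instead of raising.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("pdftables-api", "requests")):
    """Collect the Python version, platform, and installed package versions.

    The default package names are assumptions; pass the names that
    `pip list` actually shows in the failing environment.
    """
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running it inside the same virtualenv or interpreter that runs the failing batch job matters, since a different interpreter may have different package versions installed.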