Edouard Belval
Hi, you will have to use the asynchronous API. It is very similar to the synchronous API, except that the PDF file needs to be in S3 or you can...
I was not able to reproduce this issue with our internal samples. If you can share the Textract response or the original asset necessary to reproduce this issue, I can look...
This is something we could accept a PR for. I think it could be implemented as `extractor.get_status(job_id)` which returns a value from an enum defined in https://github.com/aws-samples/amazon-textract-textractor/blob/master/textractor/data/constants.py with `IN_PROGRESS`, `SUCCEEDED`,...
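A minimal sketch of what such a status helper might look like, not the library's actual implementation. The `JobStatus` enum and the free function `get_status` are hypothetical names; the `FAILED` and `PARTIAL_SUCCESS` members are taken from the job statuses Textract documents for `GetDocumentAnalysis`, not from the comment above. The sketch assumes a boto3-style Textract client is passed in and that the job was started with a document analysis call (a text detection job would need `get_document_text_detection` instead).

```python
from enum import Enum


class JobStatus(Enum):
    """Hypothetical enum mirroring the JobStatus strings returned by Textract."""

    IN_PROGRESS = "IN_PROGRESS"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    PARTIAL_SUCCESS = "PARTIAL_SUCCESS"


def get_status(client, job_id: str) -> JobStatus:
    """Return the status of an asynchronous analysis job.

    `client` is assumed to expose `get_document_analysis` like a boto3
    Textract client. MaxResults=1 keeps the response small, since only
    the JobStatus field is read here.
    """
    response = client.get_document_analysis(JobId=job_id, MaxResults=1)
    return JobStatus(response["JobStatus"])


class StubTextractClient:
    """Stand-in client so the sketch can run without AWS credentials."""

    def __init__(self, status: str):
        self._status = status

    def get_document_analysis(self, JobId: str, MaxResults: int = 1) -> dict:
        return {"JobStatus": self._status}


if __name__ == "__main__":
    stub = StubTextractClient("IN_PROGRESS")
    print(get_status(stub, "example-job-id"))
```

With a real client, the call would presumably look like `get_status(boto3.client("textract"), job_id)`; wiring it into the extractor as `extractor.get_status(job_id)` is left open here.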
I will test it first, but this looks like a known issue that happens when the LAYOUT predictions do not match the TABLE predictions, causing the reading order to be...
What version of `amazon-textract-textractor` are you using? With 1.8.2 I get: ``` Page 2 of 10 Schneider Electric South East Asia (HQ) Pte. Ltd. Schneider Electric Overseas Asia Pte Ltd...
Thank you for clarifying and sharing the file. I will attempt to reproduce the issue.
We have a fix for this issue that will be included in version 1.8.6. It should be available by March 7th.
This should be fixed in 1.9.0; let me know if that addresses your issue. The tables were not being inserted correctly in the output. Note that this will only fix the insertion...
I will leave the issue open until you can confirm that this is fixed.
Thanks for the heads-up. 1.9.0 should be on PyPI now. Note that it can take 1-2 hours for their cache to refresh. See: https://github.com/aws-samples/amazon-textract-textractor/actions/runs/13792246430