Tobias Bruckert
@derricksimpson I ran into exactly the same issue. It would be helpful to add that to the documentation somewhere.
====================== test session starts ==============================================
platform win32 -- Python 3.9.13, pytest-7.1.2, pluggy-1.0.0
rootdir: ...
plugins: anyio-3.6.1
collected 41 items
tests\test_base_trp2.py .. [ 4%]
tests\test_trp.py ......... [ 26%]
tests\test_trp2.py .......................... [...
Oh, I didn't see that. We could maybe add an option to that function, defaulting to no, similar to what was done in this function: https://github.com/aws-samples/amazon-textract-textractor/blob/d324b360dec724fc40bf46fe9f2441e8e403903f/prettyprinter/textractprettyprinter/t_pretty_print.py#L179 But yes, also another object...
@schadem are you fine with the suggested solution? If so, I will update the PR.
@sravzmum are you able to provide a sample document? I agree with the option; at some stage we could even extend it to accept custom functions for processing.
@prasum Do you have a sample document you can share? Do you get the correct results from Textract in the OCR step?
We would need some sort of example, otherwise we can't help.
@jshipway is this issue resolved now?
Can you add an example document so that we can reproduce the issue?
Hey @wilianuhlmann, did you try using the following package to get the text sorted? https://github.com/aws-samples/amazon-textract-textractor/blob/d7c6488d6a707647171641958dfe8f05d6ffbc62/src/trp.py#L526
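For illustration only, here is a rough sketch of what sorting OCR lines into reading order can look like. It is not the code behind the link above. The `OcrLine` type, the `sort_reading_order` function, and the `row_tolerance` parameter are hypothetical names, and it assumes each line comes with normalized `top` and `left` coordinates and that the page has a single column.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OcrLine:
    """Hypothetical container for one detected line of text."""
    text: str
    top: float   # assumed normalized distance from the top of the page (0..1)
    left: float  # assumed normalized distance from the left of the page (0..1)


def sort_reading_order(lines: List[OcrLine], row_tolerance: float = 0.01) -> List[OcrLine]:
    """Order lines top to bottom, then left to right within a row.

    Lines whose `top` is within `row_tolerance` of the first line in the
    current row are treated as belonging to that row.
    """
    by_position = sorted(lines, key=lambda line: (line.top, line.left))

    rows: List[List[OcrLine]] = []
    for line in by_position:
        if rows and abs(line.top - rows[-1][0].top) <= row_tolerance:
            rows[-1].append(line)  # same visual row as the previous line
        else:
            rows.append([line])    # start a new row

    # Flatten the rows, sorting each one from left to right.
    return [line for row in rows for line in sorted(row, key=lambda l: l.left)]
```

Multi-column layouts would need an extra step to split the page into columns before this kind of sort.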