Thomas Delteil

Results 24 issues of Thomas Delteil

The current workaround for PDFs is the following:
```python
from pdf2image import convert_from_path
from textractor.entities.document import Document

# Loading the JSON response
document = Document.open("output.json")
# Loading the images and...
```

Currently there is limited support for AnalyzeExpense in Textractor. We support sync and async API calls. However, we still need to implement the following: - [x] Allow duplication of KV for...

The conversion to trp2 is based on the initial response. This does not capture any modifications made to the entities, such as OCR post-processing, correction, or deletion of...

enhancement

It is often possible to improve the results of the final processing by performing adjustments on the input image. We want to provide a helper library such that it is easy...

I suggest that as a starting point we: - Use the polygon in order to visualize the text bounding box, print the text horizontally
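As a rough illustration of the polygon-based starting point, here is a minimal sketch in plain Python. It assumes the polygon is a list of normalized `(x, y)` points ordered clockwise from the top-left corner; the function names `bounding_box` and `text_angle_degrees` are hypothetical and are not part of Textractor.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def bounding_box(polygon: List[Point]) -> Tuple[float, float, float, float]:
    """Return the axis-aligned (left, top, width, height) box enclosing the polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    left, top = min(xs), min(ys)
    return left, top, max(xs) - left, max(ys) - top


def text_angle_degrees(polygon: List[Point]) -> float:
    """Angle of the edge from the first to the second point.

    Assumption: the first two points are the top-left and top-right corners,
    so this edge follows the reading direction of the text.
    """
    (x0, y0), (x1, y1) = polygon[0], polygon[1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


# A non-rotated word: the outline is drawn from the polygon itself, while the
# label is printed horizontally, anchored at the box's top-left corner.
word_polygon = [(0.1, 0.2), (0.5, 0.2), (0.5, 0.3), (0.1, 0.3)]
left, top, width, height = bounding_box(word_polygon)
angle = text_angle_degrees(word_polygon)
```

Drawing the polygon keeps the outline faithful for rotated text, while anchoring the label at the enclosing box keeps it readable regardless of the angle.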

bug

We want to improve visualization and support of table features: - [ ] Header cells

In order to improve AnalyzeID support we want to: - [ ] Improve visualization to showcase the summary fields - [ ] Improve support of summary fields such that they...

There is some limited support for table indexing, such as:
```
new_table = document.tables[0][:5, :]
```
which selects the first 5 rows of a given table. However we...
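The `[rows, columns]` indexing described above can be illustrated with a minimal sketch. `GridTable` is a hypothetical stand-in written for this example, not Textractor's actual table class, and it only models a rectangular grid of cell texts.

```python
from typing import List, Tuple, Union

Index = Union[int, slice]


def _as_slice(index: Index) -> slice:
    """Turn an integer index into a one-element slice so results stay 2-D."""
    if isinstance(index, slice):
        return index
    # `index + 1 or None` maps -1 to an open-ended slice instead of slice(-1, 0).
    return slice(index, index + 1 or None)


class GridTable:
    """Hypothetical table: a rectangular grid of cell texts."""

    def __init__(self, rows: List[List[str]]):
        self.rows = rows

    def __getitem__(self, key: Tuple[Index, Index]) -> "GridTable":
        row_key, col_key = key
        row_slice, col_slice = _as_slice(row_key), _as_slice(col_key)
        # Slice rows first, then apply the column slice to each kept row.
        return GridTable([row[col_slice] for row in self.rows[row_slice]])


# Build an 8x3 table and keep only its first 5 rows, all columns.
table = GridTable([[f"r{r}c{c}" for c in range(3)] for r in range(8)])
first_five = table[:5, :]
```

Returning a new table from every index expression, even for a single cell, keeps the result type uniform so that slices can be chained.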

It would be great if we could visualize expense_documents and the associated normalized summary fields directly on the document as well, similar to how we currently visualize KV containers...

enhancement

See screenshot of parsing the screenshot of the readme. ![Screen Shot 2022-10-31 at 5 57 15 PM](https://user-images.githubusercontent.com/3716307/199136025-1c54d102-262b-4847-a8e6-4897c971a694.png) I believe this block `[TBlock(geometry=TGeometry(bounding_box=TBoundingBox(width=1.0, height=0.912468671798706, left=0.0, top=0.030051277950406075)` is ignored

bug