Thomas Delteil issues

Results: 24 issues of Thomas Delteil

Enhancement: Allow json parser to also set the images by passing the original document

The current workaround for PDF is the following:

```python
from pdf2image import convert_from_path
from textractor.entities.document import Document

# Loading the JSON response
document = Document.open("output.json")
# Loading the images and...
```

Improve AnalyzeExpense Support

Currently there is limited support for AnalyzeExpense in Textractor. We support sync and async API calls. However, we need to implement the following:

- [x] Allow duplication of KV for...

Modification to the Document entity from response is not captured when using .to_trp2()

The conversion to trp2 is based on the initial response. It does not capture any modifications made to the entities, such as OCR post-processing, correction, or deletion of...

enhancement

Add pre-processing library to improve final results

It is often possible to improve the results of the final processing by performing adjustments on the input image. We want to provide a helper library such that it is easy...

Rotated documents are not visualized correctly

I suggest that, as a starting point, we:

- Use the polygon in order to visualize the text bounding box, print the text horizontally
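The polygon idea above can be sketched roughly as follows. This is a minimal illustration, not Textractor's implementation: the helper names, the sample polygon, and the 1000x800 image size are all hypothetical. It assumes polygon points are normalized to the 0..1 range and ordered clockwise starting from the top-left corner of the text.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_to_pixels(polygon: List[Point], width: int, height: int) -> List[Point]:
    """Scale normalized (0..1) polygon points to pixel coordinates."""
    return [(x * width, y * height) for x, y in polygon]


def polygon_angle_degrees(polygon: List[Point]) -> float:
    """Angle of the edge from the first to the second point, in degrees.

    Assumes that edge is the top edge of the text, so 0 means upright text.
    """
    (x0, y0), (x1, y1) = polygon[0], polygon[1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


# Hypothetical word rotated by 90 degrees on a 1000x800 image
rotated_word = [(0.5, 0.1), (0.5, 0.3), (0.45, 0.3), (0.45, 0.1)]
pixels = polygon_to_pixels(rotated_word, 1000, 800)
angle = polygon_angle_degrees(pixels)
```

The pixel points could then be handed to a drawing call such as Pillow's `ImageDraw.polygon` to outline the rotated text, while the label itself is still printed horizontally next to it.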

bug

Improve Table visualization

We want to improve visualization and support of Table features:

- [ ] Header cells

Improve AnalyzeID support

In order to improve AnalyzeID support we want to:

- [ ] Improve visualization to showcase the summary fields
- [ ] Improve support of summary fields such that they...

Improve Table Indexing

There is some limited support for table indexing. For example, the following selects the first 5 rows of a given table:

```
new_table = document.tables[0][:5, :]
```

However we...
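To illustrate what two-dimensional indexing of this kind involves, here is a minimal sketch using a hypothetical `GridTable` class. It is not Textractor's `Table` implementation; it only shows how a `__getitem__` that accepts a `(rows, columns)` pair of integers or slices could behave.

```python
from typing import List, Tuple, Union

Index = Union[int, slice]


def _as_slice(key: Index) -> slice:
    """Turn an integer index into a one-element slice so results stay 2D."""
    if isinstance(key, int):
        # `key + 1 or None` keeps -1 working: slice(-1, None) instead of slice(-1, 0)
        return slice(key, key + 1 or None)
    return key


class GridTable:
    """Hypothetical stand-in for a table, stored as a list of rows of cell text."""

    def __init__(self, rows: List[List[str]]):
        self.rows = rows

    def __getitem__(self, key: Tuple[Index, Index]) -> "GridTable":
        row_key, col_key = key
        row_slice, col_slice = _as_slice(row_key), _as_slice(col_key)
        # Slice the rows first, then the columns inside each selected row
        return GridTable([row[col_slice] for row in self.rows[row_slice]])


# A 7x3 table with placeholder cell text
table = GridTable([[f"r{r}c{c}" for c in range(3)] for r in range(7)])
first_five = table[:5, :]    # first 5 rows, all columns
last_row_tail = table[-1, 1:]  # last row, columns 1 onward
```

Returning a new table rather than a plain list keeps further indexing chainable, which is one possible design choice among several.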

Visualize document.expense_documents

It would be great if we could visualize expense_documents and the associated normalized summary fields directly on the document as well, similar to how we currently visualize KV containers...

enhancement

Visualization is not taking into account the Geometry block

See the screenshot of the result of parsing a screenshot of the README. ![Screen Shot 2022-10-31 at 5 57 15 PM](https://user-images.githubusercontent.com/3716307/199136025-1c54d102-262b-4847-a8e6-4897c971a694.png) I believe this block `[TBlock(geometry=TGeometry(bounding_box=TBoundingBox(width=1.0, height=0.912468671798706, left=0.0, top=0.030051277950406075)` is ignored.
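To make the numbers in that block concrete, the sketch below converts the quoted bounding box to pixel coordinates. It assumes the values are ratios of the image width and height, as Textract bounding boxes are; the `NormalizedBox` class and the 1000x1000 image size are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class NormalizedBox:
    """Hypothetical container mirroring the fields of the block quoted above."""

    width: float
    height: float
    left: float
    top: float

    def to_pixels(self, image_width: int, image_height: int) -> Tuple[int, int, int, int]:
        """Return (x0, y0, x1, y1) in pixels for an image of the given size."""
        x0 = round(self.left * image_width)
        y0 = round(self.top * image_height)
        x1 = round((self.left + self.width) * image_width)
        y1 = round((self.top + self.height) * image_height)
        return x0, y0, x1, y1


# Values copied from the block in the issue; the image size is an assumption
page_box = NormalizedBox(
    width=1.0,
    height=0.912468671798706,
    left=0.0,
    top=0.030051277950406075,
)
page_pixels = page_box.to_pixels(1000, 1000)
```

On such an image the block spans the full width but starts about 3% from the top and covers roughly 91% of the height, so a visualization that ignores it would be vertically offset.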

bug