amazon-textract-textractor
Analyze documents with Amazon Textract and generate output in multiple formats.
I suggest that as a starting point we: - Use the polygon in order to visualize the text bounding box, print the text horizontally
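As a rough illustration of the first step, the sketch below derives an axis-aligned bounding box from a polygon. It assumes the polygon is a list of points with normalized `X` and `Y` keys, as in Textract geometry output; the function name `polygon_to_bbox` is hypothetical and not part of Textractor.

```python
from typing import Dict, List, Tuple


def polygon_to_bbox(polygon: List[Dict[str, float]]) -> Tuple[float, float, float, float]:
    """Return (left, top, width, height) of the box enclosing the polygon.

    Hypothetical helper for illustration only; assumes each point is a
    dict with normalized "X" and "Y" coordinates.
    """
    if not polygon:
        raise ValueError("polygon must contain at least one point")
    xs = [point["X"] for point in polygon]
    ys = [point["Y"] for point in polygon]
    left, top = min(xs), min(ys)
    # Width and height span the extreme points of the polygon.
    return left, top, max(xs) - left, max(ys) - top


# A slightly rotated quadrilateral around a word.
word_polygon = [
    {"X": 0.25, "Y": 0.5},
    {"X": 0.75, "Y": 0.25},
    {"X": 0.75, "Y": 0.5},
    {"X": 0.25, "Y": 0.75},
]
bbox = polygon_to_bbox(word_polygon)
```

The enclosing box gives a horizontal region in which the text can be printed, even when the polygon itself is rotated.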
We want to improve visualization and support of Tables features: - [ ] Header cells
To improve AnalyzeID support we want to: - [ ] Improve visualization to showcase the summary fields - [ ] Improve support of summary fields such that they...
When exporting the API output of Textract Tables, an error will be shown when opening the resulting `.xlsx` file in Microsoft Excel.
https://github.com/aws-samples/amazon-textract-textractor/issues/134 was merged and the underlying caller now supports AnalyzeLending. We need to add it to Textractor.
There is some limited support for table indexing, such as `new_table = document.tables[0][:5, :]`, which selects the first 5 rows of a given table. However we...
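The row and column slicing described here can be sketched with a `__getitem__` that accepts a `(rows, columns)` tuple. `SimpleTable` below is a hypothetical stand-in written for illustration; it is not Textractor's table class and makes no claim about how Textractor implements indexing.

```python
from typing import List, Tuple, Union

Index = Union[int, slice]


def _as_slice(index: Index) -> slice:
    """Turn an integer index into a one-element slice; pass slices through."""
    if isinstance(index, slice):
        return index
    # -1 needs an open-ended stop, otherwise slice(-1, 0) would be empty.
    stop = index + 1 if index != -1 else None
    return slice(index, stop)


class SimpleTable:
    """Hypothetical 2D table holding cell text in a list of rows."""

    def __init__(self, rows: List[List[str]]):
        self.rows = rows

    def __getitem__(self, key: Tuple[Index, Index]) -> "SimpleTable":
        row_key, col_key = key
        row_slice, col_slice = _as_slice(row_key), _as_slice(col_key)
        # Select rows first, then the requested columns within each row.
        selected = [row[col_slice] for row in self.rows[row_slice]]
        return SimpleTable(selected)


table = SimpleTable([[f"r{r}c{c}" for c in range(3)] for r in range(7)])
first_five_rows = table[:5, :]
last_row_middle_cell = table[-1, 1]
```

Always returning a table, even for a single cell, keeps the result type predictable for callers that chain further selections.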
Tests for prettyprinter call Textract directly instead of using JSON

```
def test_pretty_with_tables():
    features = [Textract_Features.FORMS, Textract_Features.TABLES]
    textract_client = boto3.client('textract', region_name='us-east-2')
    response = call_textract(input_document="s3://amazon-textract-public-content/blogs/w2-example.png", features=features, boto3_textract_client=textract_client)
    assert response
    tables_result =...
```
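One possible shape for an offline version of such a test is sketched below: the response is read from a saved JSON file rather than fetched from the service. The helper names, the fixture file name, and the tiny hand-written response are all assumptions made for illustration; a real fixture would be a response captured once from Textract and committed alongside the tests.

```python
import json
import tempfile
from pathlib import Path


def load_saved_response(path):
    """Load a previously saved response from disk (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def count_blocks(response, block_type):
    """Count blocks of one type in a Textract-style response dict."""
    blocks = response.get("Blocks", [])
    return sum(1 for block in blocks if block.get("BlockType") == block_type)


def test_tables_from_saved_json():
    # Hand-written stand-in, not real Textract output.
    sample = {
        "Blocks": [
            {"BlockType": "PAGE", "Id": "p1"},
            {"BlockType": "TABLE", "Id": "t1"},
            {"BlockType": "CELL", "Id": "c1"},
        ]
    }
    with tempfile.TemporaryDirectory() as tmp:
        fixture = Path(tmp) / "w2-example.json"  # hypothetical fixture name
        fixture.write_text(json.dumps(sample), encoding="utf-8")
        response = load_saved_response(fixture)
    # No network call or AWS credentials are needed from here on.
    assert response
    assert count_blocks(response, "TABLE") == 1
```

Because nothing in the test touches boto3, it can run in CI without an AWS account and gives the same result on every run.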
Move the integration testing for the caller and textractor to a different account than 913165245630
Hi @Belval 1. I found that every time you create the [TGeoFinder](https://github.com/aws-samples/amazon-textract-textractor/blob/master/tpipelinegeofinder/textractgeofinder/tgeofinder.py#L51) class from the JSON data, you actually generate a UUID for this object and insert lot...
It would be great if we could visualize expense_documents and the associated normalized summary fields directly on the document as well, similar to how we currently visualize KV containers...