amazon-textract-response-parser
Parse JSON response of Amazon Textract
*Issue #: #73* *Description of changes:* Add confidence to query answers. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the...
Implemented a custom header processor as an argument of `Table.get_header_field_names`. The new argument is named `header_proc_func`. The processor function receives the headers as a list of lists as its input argument.
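As an illustration of what such a processor might look like, here is a minimal sketch. It assumes each inner list is one header row of plain cell strings and that the processor returns one name per column; neither assumption is confirmed by the PR description, and `join_header_rows` is a hypothetical name.

```python
from typing import List


def join_header_rows(headers: List[List[str]]) -> List[str]:
    """Hypothetical processor: merge stacked header rows column by column.

    Assumes `headers` is a list of header rows and each row is a list of
    cell strings (an assumption, not the library's documented contract).
    """
    if not headers:
        return []
    width = max(len(row) for row in headers)
    names = []
    for col in range(width):
        # Collect the non-empty text found in this column across all rows.
        parts = [
            row[col].strip()
            for row in headers
            if col < len(row) and row[col].strip()
        ]
        names.append(" ".join(parts))
    return names
```

If the assumptions hold, a function like this could be passed as `header_proc_func` so that a two-row header such as `Price` over `USD` yields a single field name.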
Would it be a good idea to add [EntityTypes](https://github.com/aws-samples/amazon-textract-response-parser/blob/581ae2a8a9fa7f0d4c86ac11e4dc0b2a77246cf0/src-js/test/data/table-example-response.json#L2001) to [ApiCellBlock](https://github.com/aws-samples/amazon-textract-response-parser/blob/581ae2a8a9fa7f0d4c86ac11e4dc0b2a77246cf0/src-js/src/api-models/document.ts#L100), in [src-js](https://github.com/aws-samples/amazon-textract-response-parser/tree/581ae2a8a9fa7f0d4c86ac11e4dc0b2a77246cf0/src-js)? `EntityTypes` for `BlockType : "CELL"` can be useful to identify `COLUMN_HEADER` cells instead of assuming that `cellsAt(1,...
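The idea can be sketched against the raw Textract response JSON. The issue concerns the `src-js` package, but this sketch uses Python and plain dictionaries so it does not depend on any parser API; `column_header_cells` is a hypothetical helper, and it assumes header cells carry `"COLUMN_HEADER"` in their `EntityTypes` list.

```python
from typing import List


def column_header_cells(response: dict) -> List[dict]:
    """Return CELL blocks that Textract has tagged as column headers.

    Hypothetical helper working on the raw response dict; cells without an
    EntityTypes field are treated as ordinary cells.
    """
    return [
        block
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "CELL"
        and "COLUMN_HEADER" in (block.get("EntityTypes") or [])
    ]
```

Filtering on the tag avoids hard-coding the assumption that the first row is always the header row.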
Bumps [terser](https://github.com/terser/terser) from 5.12.1 to 5.14.2. Changelog Sourced from terser's changelog. v5.14.2 Security fix for RegExps that should not be evaluated (regexp DDOS) Source maps improvements (#1211) Performance improvements in...
*Description of changes* Add several hashmaps (signature: `Dict[str, int]`) with the block ID as keys and the block index from `self.blocks` as values. As Textract identifies blocks by their ID,...
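A minimal sketch of the ID-to-index map described above, written against plain block dictionaries rather than the library's classes. `build_block_index` and `find_block` are hypothetical names, and the sketch assumes every block has a unique `Id` key.

```python
from typing import Dict, List, Optional


def build_block_index(blocks: List[dict]) -> Dict[str, int]:
    """Map each block ID to its position in the `blocks` list."""
    return {block["Id"]: position for position, block in enumerate(blocks)}


def find_block(
    blocks: List[dict], index: Dict[str, int], block_id: str
) -> Optional[dict]:
    """Look up a block by ID through the index instead of scanning the list."""
    position = index.get(block_id)
    return None if position is None else blocks[position]
```

Building the map once turns each relationship lookup into a dictionary access rather than a linear scan over all blocks.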
Unable to extract merged-cell text properly. There is an issue with the combine-headers function: Textract is not able to extract the top header text properly. Reference: t_doc =...
*Description of changes:* Upgrade the marshmallow version. ======================== test session starts ======================== platform win32 -- Python 3.9.12, pytest-7.1.2, pluggy-1.0.0 rootdir: ... collected 41 items tests\test_base_trp2.py .. [ 4%] tests\test_trp.py ............
I would like to know the confidence level of query results, but it is not currently exposed. I suggest making a small change to `get_query_answers` in `trp2.py`: ``` def...
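For illustration only, the request can be sketched against the raw response JSON instead of `trp2.py`. This is not the change proposed in the issue, whose code is truncated here. It assumes `QUERY` blocks link to `QUERY_RESULT` blocks through an `ANSWER` relationship, and `query_answers_with_confidence` is a hypothetical name.

```python
from typing import List, Optional, Tuple

Answer = Tuple[str, Optional[str], str, Optional[float]]


def query_answers_with_confidence(response: dict) -> List[Answer]:
    """Return (query text, alias, answer text, confidence) for each answer.

    Hypothetical helper operating on the raw Textract response dict.
    """
    blocks = response.get("Blocks", [])
    by_id = {block["Id"]: block for block in blocks}
    results: List[Answer] = []
    for block in blocks:
        if block.get("BlockType") != "QUERY":
            continue
        query = block.get("Query", {})
        for relationship in block.get("Relationships") or []:
            if relationship.get("Type") != "ANSWER":
                continue
            for answer_id in relationship.get("Ids", []):
                answer = by_id.get(answer_id)
                if answer is None:
                    continue  # Skip references to blocks not in the response.
                results.append((
                    query.get("Text", ""),
                    query.get("Alias"),
                    answer.get("Text", ""),
                    answer.get("Confidence"),
                ))
    return results
```

Returning the confidence alongside each answer lets callers apply their own threshold before trusting a result.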
Hi, I've been trying to extract only the text in reading order for multi-column cases with the code below. The problem is that the number of columns has to be set manually. I've...
Hello, I am currently performing OCR on a one-page document that contains multiple entities with the same name, each with a checkbox in front of it. I am...