
Python Support for Column Headers

Open Belval opened this issue 1 year ago • 0 comments

Discussed in https://github.com/aws-samples/amazon-textract-textractor/discussions/350

Originally posted by samwhealon on April 11, 2024

I have been experimenting with this library and the original textract-response-parser (TRP). TRP doesn't support returning table_titles, so I looked at textractor, which supports titles but not column_headers.

Are there plans to populate column headers when the table object is created?

The table entity has a column_headers property and a matching setter; however, the response parser never seems to call that setter.

Where I think this would go in the parser: https://github.com/aws-samples/amazon-textract-textractor/blob/8cf3759adc4a9eee56f5ae1d15e778aa70bf88ca/textractor/parsers/response_parser.py#L1013

The relevant table functions are here: https://github.com/aws-samples/amazon-textract-textractor/blob/8cf3759adc4a9eee56f5ae1d15e778aa70bf88ca/textractor/entities/table.py#L152

Screenshot of my debugger: the column-header flag registers as true on the individual table_cells, but the table's column_headers field is not populated.

image
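A minimal sketch of a possible workaround, given that the header flag is already set on the cells: group the cells by column and key each group by its header text. The `Cell` dataclass and `collect_column_headers` helper are hypothetical stand-ins written for illustration, not textractor's actual classes or API; the attribute names (`text`, `row_index`, `col_index`, `is_column_header`) are assumptions modelled on what the debugger shows.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Cell:
    """Hypothetical stand-in for a parsed table cell (not textractor's class)."""
    text: str
    row_index: int
    col_index: int
    is_column_header: bool = False


def collect_column_headers(cells: Iterable[Cell]) -> Dict[str, List[Cell]]:
    """Map each header's text to the non-header cells in the same column."""
    cells = list(cells)
    headers: Dict[str, List[Cell]] = {}
    for header in (c for c in cells if c.is_column_header):
        headers[header.text] = [
            c for c in cells
            if c.col_index == header.col_index and not c.is_column_header
        ]
    return headers


# Toy table: one header row and one data row.
cells = [
    Cell("Item", 1, 1, is_column_header=True),
    Cell("Price", 1, 2, is_column_header=True),
    Cell("Apple", 2, 1),
    Cell("1.20", 2, 2),
]
headers = collect_column_headers(cells)
print({name: [c.text for c in column] for name, column in headers.items()})
# {'Item': ['Apple'], 'Price': ['1.20']}
```

If the parser did something along these lines after building the cells, the result could be handed to the existing column_headers setter. Note that keying by header text means two columns with identical header text would collide in this sketch.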

Belval, Apr 11 '24 21:04