amazon-textract-response-parser

add confidence to list

Open tb102122 opened this issue 2 years ago • 5 comments

Issue #: #73

Description of changes: add confidence to query answer

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

tb102122 avatar Jul 11 '22 23:07 tb102122

That is an API-breaking change. I should have added more initially; it was a quick hack to get it out. Maybe use trp2.convert_queries_to_list_trp2 (https://github.com/aws-samples/amazon-textract-textractor/blob/d324b360dec724fc40bf46fe9f2441e8e403903f/prettyprinter/textractprettyprinter/t_pretty_print.py#L147)

Or we can add another method. Customers often want bounding box information as well, so it may make sense to offer the output in different formats.

schadem avatar Jul 12 '22 02:07 schadem

Oh, I didn't see that. We could maybe add an option that defaults to off in that function, similar to what was done here: https://github.com/aws-samples/amazon-textract-textractor/blob/d324b360dec724fc40bf46fe9f2441e8e403903f/prettyprinter/textractprettyprinter/t_pretty_print.py#L179

But yes, another object would also work well. @noxqs, what's your take on it?
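A rough sketch of what such an opt-in flag could look like, written against plain Textract response blocks rather than the library's own classes. The function name `query_answers`, the `with_confidence` parameter, and the row layout `[alias, query, answer]` are assumptions made for illustration, not the existing API; only the block fields (`BlockType`, `Query`, `Relationships`, `Confidence`) follow the Textract response format.

```python
from typing import Any, Dict, List


def query_answers(
    blocks: List[Dict[str, Any]], with_confidence: bool = False
) -> List[List[Any]]:
    """Collect query answers from raw Textract blocks.

    Hypothetical helper: with the default with_confidence=False the rows keep
    a three-column shape, so callers that unpack three values keep working.
    """
    by_id = {block["Id"]: block for block in blocks}
    rows: List[List[Any]] = []
    for block in blocks:
        if block.get("BlockType") != "QUERY":
            continue
        query = block.get("Query", {})
        for rel in block.get("Relationships", []):
            if rel.get("Type") != "ANSWER":
                continue
            for answer_id in rel.get("Ids", []):
                answer = by_id.get(answer_id)
                if answer is None:
                    continue
                row = [query.get("Alias", ""), query.get("Text", ""), answer.get("Text", "")]
                if with_confidence:
                    # Only appended on request, so the default shape is unchanged.
                    row.append(answer.get("Confidence"))
                rows.append(row)
    return rows


# Made-up sample data for illustration only.
SAMPLE_BLOCKS = [
    {
        "Id": "q1",
        "BlockType": "QUERY",
        "Query": {"Text": "What is the name?", "Alias": "NAME"},
        "Relationships": [{"Type": "ANSWER", "Ids": ["a1"]}],
    },
    {"Id": "a1", "BlockType": "QUERY_RESULT", "Text": "Jane Doe", "Confidence": 98.5},
]

print(query_answers(SAMPLE_BLOCKS))
print(query_answers(SAMPLE_BLOCKS, with_confidence=True))
```

Because the flag defaults to off, existing callers would see the same three-column rows as before; only callers that pass `with_confidence=True` get the extra column.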

tb102122 avatar Jul 12 '22 03:07 tb102122

I think that could work too. I depend on this feature because confidence levels are important: it's one of the two methods I have for giving my client feedback about failing OCR. For now I monkey patch (sorry) get_query_answers in my projects. I understand this request is API-breaking, but to me the OCR result is only as good as its confidence level.

noxqs avatar Aug 05 '22 06:08 noxqs

@schadem, are you fine with the suggested solution? If so, I will update the PR.

tb102122 avatar Aug 05 '22 09:08 tb102122

It should be a new method, as changing the existing one is an API-breaking change and would impact customers already using it. @tb102122
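For illustration, the "new method" route might look roughly like the sketch below. `QueryDocument`, `QueryAnswer`, and `get_query_answers_detailed` are hypothetical names chosen for this example and are not part of the library; the existing-style method is left untouched, and the new one adds confidence and bounding box information read from the Textract block fields.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class QueryAnswer:
    """Hypothetical richer result type returned by the new method."""
    alias: str
    query: str
    answer: str
    confidence: Optional[float]
    bounding_box: Optional[Dict[str, float]]


class QueryDocument:
    """Hypothetical stand-in for a parsed Textract document."""

    def __init__(self, blocks: List[Dict[str, Any]]) -> None:
        self._blocks = blocks
        self._by_id = {block["Id"]: block for block in blocks}

    def _iter_pairs(self) -> Iterator[Tuple[Dict[str, Any], Dict[str, Any]]]:
        # Yield (query block, answer block) pairs linked by ANSWER relationships.
        for block in self._blocks:
            if block.get("BlockType") != "QUERY":
                continue
            for rel in block.get("Relationships", []):
                if rel.get("Type") != "ANSWER":
                    continue
                for answer_id in rel.get("Ids", []):
                    answer = self._by_id.get(answer_id)
                    if answer is not None:
                        yield block, answer

    def get_query_answers(self) -> List[List[str]]:
        # Existing-style method: return shape stays exactly as callers expect.
        return [
            [q.get("Query", {}).get("Alias", ""), q.get("Query", {}).get("Text", ""), a.get("Text", "")]
            for q, a in self._iter_pairs()
        ]

    def get_query_answers_detailed(self) -> List[QueryAnswer]:
        # New method: adds confidence and geometry without touching the old one.
        return [
            QueryAnswer(
                alias=q.get("Query", {}).get("Alias", ""),
                query=q.get("Query", {}).get("Text", ""),
                answer=a.get("Text", ""),
                confidence=a.get("Confidence"),
                bounding_box=a.get("Geometry", {}).get("BoundingBox"),
            )
            for q, a in self._iter_pairs()
        ]


# Made-up sample data for illustration only.
doc = QueryDocument([
    {
        "Id": "q1",
        "BlockType": "QUERY",
        "Query": {"Text": "What is the invoice date?", "Alias": "DATE"},
        "Relationships": [{"Type": "ANSWER", "Ids": ["a1"]}],
    },
    {
        "Id": "a1",
        "BlockType": "QUERY_RESULT",
        "Text": "2022-07-11",
        "Confidence": 97.0,
        "Geometry": {"BoundingBox": {"Left": 0.1, "Top": 0.2, "Width": 0.3, "Height": 0.05}},
    },
])

for item in doc.get_query_answers_detailed():
    print(item.alias, item.answer, item.confidence, item.bounding_box)
```

Returning a small named type instead of a longer list leaves room to add further fields later without another breaking change, though a plain list with extra columns would work just as well.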

schadem avatar Dec 02 '22 00:12 schadem