amazon-textract-textractor

Visualizing words with search_words shows wrong results

Open • Belval opened this issue 2 years ago • 1 comment

In the documentation, this example does not generate the right output: https://aws-samples.github.io/amazon-textract-textractor/notebooks/visualizing_results.html#Visualizing-the-result-of-a-search

Expected: image

Result: image

This occurs when torch is not installed (but might occur when it is installed as well).
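To make the report easier to reproduce, it may help to state which environment was used. A minimal sketch for checking whether torch is importable, using only the standard library (the helper name `is_installed` is hypothetical, not part of Textractor):

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Report which of the two environments (with or without torch) is in use.
    print("torch installed:", is_installed("torch"))
```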

Belval, Apr 10 '23 17:04

The word/line similarity code is definitely buggy. It would be nice to be able to pass in distance metrics, e.g. from the textdistance package: https://pypi.org/project/textdistance/
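A rough sketch of what a pluggable metric could look like. This is not Textractor's implementation or API: `search_words_sketch`, `ratio_similarity`, and the `similarity` parameter are hypothetical names, and the default metric here is the standard library's `difflib.SequenceMatcher` ratio, chosen only so the example runs without extra dependencies.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: any function mapping two strings to a score in [0, 1].
Similarity = Callable[[str, str], float]


def ratio_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity based on difflib's matching-block ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def search_words_sketch(
    words: Sequence[str],
    keyword: str,
    similarity: Similarity = ratio_similarity,
    threshold: float = 0.6,
    top_k: int = 5,
) -> List[Tuple[float, str]]:
    """Return up to top_k (score, word) pairs scoring at least threshold."""
    scored = [(similarity(word, keyword), word) for word in words]
    hits = [pair for pair in scored if pair[0] >= threshold]
    # Highest score first; sort is stable, so ties keep their input order.
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits[:top_k]


if __name__ == "__main__":
    print(search_words_sketch(["Tax", "Taxes", "Total", "Date"], "tax"))
```

With a design like this, a metric from textdistance could be passed as the `similarity` argument if that package is installed, for example its normalized Levenshtein similarity, as long as it returns a score in the same 0 to 1 range that the threshold assumes.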

schorndorfer, May 08 '23 22:05