
Support integrity in text spacing with prettyprint

Open rasrivid opened this issue 2 years ago • 0 comments

The image shows multi-column text for which Textract returns words with bounding-box information. [Screenshot 2023-02-15 at 11 37 41 AM]

Aim: support an export/pretty-print that retains the spacing shown in the document, i.e. prints the digital text in its multi-column format. Example: [image]

Conversion of text to the following format:

1 First chapter                                   3
1.1 Section One                                   3
1.2 Section Two                                   3
1.3 Section Three                                 3

2 Last chapter                                    5
2.1 Section One                                   5
2.2 Section Two                                   5
2.3 Section Three                                 5

rasrivid, Feb 15 '23 19:02