amazon-textract-textractor Enhancement: Allow json parser to also set the images by passing the original document

Open ThomasDelteil opened this issue 2 years ago • 0 comments

The current workaround for PDFs is the following:

from pdf2image import convert_from_path
from textractor.entities.document import Document

# Loading the JSON response 
document = Document.open("output.json")

# Loading the images and setting them on each page
images = convert_from_path('doc.pdf')
for page, image in zip(document.pages, images):
    page.image = image
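
One caveat with the loop above: `zip` stops at the shorter sequence, so if the PDF and the JSON response disagree on page count, some pages silently end up without an image. A minimal sketch of a stricter helper follows. `attach_images` is a hypothetical name and not part of textractor; it only assumes each page object accepts an `image` attribute, as in the workaround.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def attach_images(pages: Iterable[Any], images: Iterable[Any]) -> None:
    """Set page.image for every page, failing loudly on a count mismatch.

    Hypothetical helper, not a textractor API.
    """
    pages = list(pages)
    images = list(images)
    if len(pages) != len(images):
        raise ValueError(
            f"page/image count mismatch: {len(pages)} pages, {len(images)} images"
        )
    for page, image in zip(pages, images):
        page.image = image


# Stand-in objects so the sketch runs without textractor or pdf2image installed.
demo_pages = [SimpleNamespace(image=None), SimpleNamespace(image=None)]
attach_images(demo_pages, ["first-page-image", "second-page-image"])
```

With the real objects from the workaround, the call would be `attach_images(document.pages, convert_from_path('doc.pdf'))`.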

Aug 09 '23 17:08 ThomasDelteil
