python-documentai-toolbox

DocumentAI: Issues with Multi-Page PDF

Open lpiscusc opened this issue 11 months ago • 0 comments

The provided conversion script (https://github.com/googleapis/python-documentai-toolbox/blob/d29ff95742269a95e1e96e047f0fa1268457292a/samples/snippets/convert_external_annotations_sample.py) seems to support only single-page documents. When we upload a multi-page document, all bounding boxes appear on the first page in the DocumentAI UI, even though the page numbers in the annotations are correct. We've tried modifying the config file, without success.

Request: Information on how to handle multi-page documents with DocumentAI, ensuring bounding boxes appear on the correct corresponding pages.
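To make the expected result concrete: in the Document AI `Document` format, an entity's page placement is carried by `page_anchor.page_refs[].page`, a zero-based index into the document's pages, together with a `bounding_poly` of `normalized_vertices`. The sketch below is only an illustration of that target shape, not the toolbox converter's code. The input annotation layout (`page`, `box`), the `page_sizes` mapping, and both helper names are hypothetical, and the output is a plain dict using the proto field names.

```python
from collections import defaultdict


def to_page_ref(annotation, page_sizes):
    """Build a dict shaped like a Document.PageAnchor.PageRef.

    Hypothetical input format (an assumption, not the converter's schema):
      annotation = {"page": <1-based page number>,
                    "box": (x_min, y_min, x_max, y_max) in pixels}
      page_sizes = {<1-based page number>: (width, height) in pixels}
    """
    page_number = annotation["page"]
    width, height = page_sizes[page_number]
    x_min, y_min, x_max, y_max = annotation["box"]

    # Corners in clockwise order, starting at the top-left.
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]

    return {
        # PageRef.page is a zero-based index into Document.pages.
        "page": page_number - 1,
        "bounding_poly": {
            # Normalize against the size of the page the box belongs to.
            "normalized_vertices": [
                {"x": x / width, "y": y / height} for x, y in corners
            ]
        },
    }


def group_by_page(annotations, page_sizes):
    """Group the generated page refs by zero-based page index."""
    grouped = defaultdict(list)
    for annotation in annotations:
        ref = to_page_ref(annotation, page_sizes)
        grouped[ref["page"]].append(ref)
    return dict(grouped)
```

If every generated page ref ends up with `page` equal to 0, or the boxes are normalized against the first page's dimensions, the symptom would match what we see in the UI.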

Test Cases:
Sample Annotations: https://github.com/googleapis/python-documentai-toolbox/blob/d29ff95742269a95e1e96e047f0fa1268457292a/tests/unit/resources/converters/test_type_1.json
Config File: https://github.com/googleapis/python-documentai-toolbox/blob/d29ff95742269a95e1e96e047f0fa1268457292a/tests/unit/resources/converters/test_config_type_1.json

KR.

lpiscusc • Jan 22 '25 07:01