Leverage word bbox from pdf-parser-v2 in the layout- and table-model

Open PeterStaar-IBM opened this issue 1 year ago • 6 comments

Requested feature

docling-parse-v2 gives us much finer-grained bbox information (down to the word level), which the layout and table models could easily leverage for improved accuracy.
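A minimal sketch of one way word-level bboxes could help a table model: assign each word to the predicted cell that contains most of its area. This is an illustration under stated assumptions, not docling's implementation. The names `BBox`, `assign_words_to_cells`, and `min_overlap` are hypothetical and do not come from docling, docling-parse, or docling-ibm-models, and the coordinates assume a top-left origin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Hypothetical axis-aligned box, top-left origin: left, top, right, bottom."""
    l: float
    t: float
    r: float
    b: float

    def area(self) -> float:
        return max(0.0, self.r - self.l) * max(0.0, self.b - self.t)

    def intersection_area(self, other: "BBox") -> float:
        width = min(self.r, other.r) - max(self.l, other.l)
        height = min(self.b, other.b) - max(self.t, other.t)
        return max(0.0, width) * max(0.0, height)


def assign_words_to_cells(words, cells, min_overlap=0.5):
    """Map each cell index to the texts of the words that mostly fall inside it.

    words: list of (text, BBox) pairs; cells: list of BBox.
    A word goes to the cell covering the largest fraction of its area,
    provided that fraction is at least min_overlap (an assumed threshold).
    """
    assigned = {i: [] for i in range(len(cells))}
    for text, word_box in words:
        word_area = word_box.area()
        if word_area == 0.0:
            continue  # skip degenerate boxes
        best_idx, best_frac = None, 0.0
        for i, cell_box in enumerate(cells):
            frac = word_box.intersection_area(cell_box) / word_area
            if frac > best_frac:
                best_idx, best_frac = i, frac
        if best_idx is not None and best_frac >= min_overlap:
            assigned[best_idx].append(text)
    return assigned


# Two adjacent cells in one row, and three words with made-up coordinates.
cells = [BBox(0, 0, 50, 20), BBox(50, 0, 100, 20)]
words = [
    ("Total", BBox(5, 4, 30, 16)),
    ("cost", BBox(32, 4, 48, 16)),
    ("42.0", BBox(60, 4, 85, 16)),
]
result = assign_words_to_cells(words, cells)
```

Normalising by the word's own area rather than using plain IoU keeps small words from being rejected just because the cell is much larger than they are.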

PeterStaar-IBM avatar Nov 09 '24 07:11 PeterStaar-IBM

Working on implementation, it will require some refactoring of page_preprocessing_model as well as table_structure_model.

maxmnemonic avatar Nov 12 '24 17:11 maxmnemonic

> Working on implementation, it will require some refactoring of page_preprocessing_model as well as table_structure_model.

thank you very much

aborruso avatar Nov 12 '24 18:11 aborruso

@maxmnemonic you can leverage this new feature in docling-parse (https://github.com/DS4SD/docling-parse/pull/57)

PeterStaar-IBM avatar Nov 16 '24 07:11 PeterStaar-IBM

@PeterStaar-IBM, is there any update to https://github.com/DS4SD/docling-ibm-models so that table parsing can utilise this?

mllife avatar Nov 18 '24 05:11 mllife

It is coming ;)

PeterStaar-IBM avatar Nov 18 '24 06:11 PeterStaar-IBM

@PeterStaar-IBM, I am using your table model and have integrated it into my code based on the tests, which use the iocr.parse format: https://github.com/DS4SD/docling-ibm-models/blob/main/tests/test_data/samples/ADS.2007.page_123.png_iocr.parse_format.json. What changes are required on my side? That format already has word-based bboxes.

mllife avatar Nov 18 '24 07:11 mllife

Solved here: https://github.com/DS4SD/docling-parse/pull/73

PeterStaar-IBM avatar Dec 10 '24 15:12 PeterStaar-IBM