docling Leverage word bbox from pdf-parser-v2 in the layout- and table-model

Requested feature

docling-parse-v2 gives us much finer-grained bbox information (down to the word level), which the layout and table models could easily leverage for improved accuracy.
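To illustrate the idea, here is a minimal sketch of how word-level boxes could be matched to predicted table cells by overlap. This is not the docling or docling-ibm-models implementation. The `BBox` class, the `assign_words_to_cells` helper, the `min_overlap` threshold, and the top-left coordinate origin are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class BBox:
    """Hypothetical axis-aligned box, top-left origin (l < r, t < b)."""
    l: float
    t: float
    r: float
    b: float

    def area(self) -> float:
        return max(0.0, self.r - self.l) * max(0.0, self.b - self.t)

    def intersection(self, other: "BBox") -> float:
        # Overlap area; zero when the boxes do not intersect.
        w = min(self.r, other.r) - max(self.l, other.l)
        h = min(self.b, other.b) - max(self.t, other.t)
        return max(0.0, w) * max(0.0, h)


def assign_words_to_cells(
    words: List[Tuple[str, BBox]],
    cells: List[BBox],
    min_overlap: float = 0.5,  # assumed threshold, not taken from docling
) -> List[List[str]]:
    """Put each word into the cell that covers the largest share of its box."""
    result: List[List[str]] = [[] for _ in cells]
    for text, box in words:
        if box.area() == 0.0:
            continue  # skip degenerate boxes
        best_idx: Optional[int] = None
        best_frac = 0.0
        for i, cell in enumerate(cells):
            frac = box.intersection(cell) / box.area()
            if frac > best_frac:
                best_idx, best_frac = i, frac
        if best_idx is not None and best_frac >= min_overlap:
            result[best_idx].append(text)
    return result


cells = [BBox(0, 0, 50, 10), BBox(50, 0, 100, 10)]
words = [
    ("Total", BBox(2, 2, 20, 8)),
    ("revenue", BBox(22, 2, 48, 8)),
    ("42", BBox(60, 2, 70, 8)),
    ("stray", BBox(200, 200, 210, 210)),  # outside every cell, so dropped
]
cell_texts = assign_words_to_cells(words, cells)
```

With word boxes, text that a coarser line-level box would smear across two cells can be split between them, which is the accuracy gain hoped for here.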

Nov 09 '24 07:11 PeterStaar-IBM

I am working on the implementation. It will require some refactoring of page_preprocessing_model as well as table_structure_model.

Nov 12 '24 17:11 maxmnemonic

> Working on implementation, it will require some refactoring of page_preprocessing_model as well as table_structure_model.

Thank you very much.

Nov 12 '24 18:11 aborruso

@maxmnemonic, you can leverage this new feature in docling-parse: https://github.com/DS4SD/docling-parse/pull/57

Nov 16 '24 07:11 PeterStaar-IBM

@PeterStaar-IBM, is there any update to https://github.com/DS4SD/docling-ibm-models so that table parsing can make use of this?

Nov 18 '24 05:11 mllife

It is coming ;)

Nov 18 '24 06:11 PeterStaar-IBM

@PeterStaar-IBM, I am using your table model and have integrated it into my code based on the tests, which use the iocr.parse format: https://github.com/DS4SD/docling-ibm-models/blob/main/tests/test_data/samples/ADS.2007.page_123.png_iocr.parse_format.json. What changes are required on my side? That format already has word-based bboxes.

Nov 18 '24 07:11 mllife

Solved here: https://github.com/DS4SD/docling-parse/pull/73

Dec 10 '24 15:12 PeterStaar-IBM