Leverage word bboxes from pdf-parser-v2 in the layout and table models
Requested feature
docling-parse-v2 gives us much finer-grained bbox information (down to the word level), which the layout and table models could easily leverage for improved accuracy.
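To illustrate the idea, here is a minimal sketch of how word-level bboxes could be matched to table cells by overlap. This is not the docling-ibm-models implementation: the `BBox` class, the `assign_words_to_cells` function, the `min_overlap` threshold, and the top-left-origin coordinate convention are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class BBox:
    """Hypothetical box with a top-left origin: l < r and t < b."""
    l: float
    t: float
    r: float
    b: float

    def area(self) -> float:
        return max(0.0, self.r - self.l) * max(0.0, self.b - self.t)

    def intersection(self, other: "BBox") -> float:
        # Overlap width and height, clamped to zero when the boxes are disjoint.
        w = min(self.r, other.r) - max(self.l, other.l)
        h = min(self.b, other.b) - max(self.t, other.t)
        return max(0.0, w) * max(0.0, h)


def assign_words_to_cells(
    words: List[Tuple[str, BBox]],
    cells: List[BBox],
    min_overlap: float = 0.5,  # assumed threshold, not taken from the models
) -> List[str]:
    """Put each word into the cell that covers the largest share of its area."""
    cell_words: List[List[str]] = [[] for _ in cells]
    for text, word_box in words:
        word_area = word_box.area()
        if word_area == 0.0:
            continue  # skip degenerate boxes
        best_idx, best_frac = -1, 0.0
        for idx, cell_box in enumerate(cells):
            frac = word_box.intersection(cell_box) / word_area
            if frac > best_frac:
                best_idx, best_frac = idx, frac
        # Words that do not sufficiently overlap any cell are dropped.
        if best_idx >= 0 and best_frac >= min_overlap:
            cell_words[best_idx].append(text)
    return [" ".join(ws) for ws in cell_words]


cells = [BBox(0, 0, 50, 20), BBox(50, 0, 100, 20)]
words = [
    ("Total", BBox(5, 5, 30, 15)),
    ("revenue", BBox(32, 5, 48, 15)),
    ("42", BBox(60, 5, 70, 15)),
    ("stray", BBox(200, 200, 210, 210)),  # outside every cell
]
print(assign_words_to_cells(words, cells))
```

With word-level boxes, a word that straddles a cell border can be assigned as a whole to the cell holding most of it, which is harder to do when only coarser text boxes are available.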
Working on the implementation; it will require some refactoring of page_preprocessing_model as well as table_structure_model.
Thank you very much.
@maxmnemonic you can leverage this new feature in docling-parse (https://github.com/DS4SD/docling-parse/pull/57)
@PeterStaar-IBM, is there any update to https://github.com/DS4SD/docling-ibm-models so that table parsing can utilise this?
It is coming ;)
@PeterStaar-IBM, I am using your table model and have integrated it into my code based on the tests, which use the iocr.parse format: https://github.com/DS4SD/docling-ibm-models/blob/main/tests/test_data/samples/ADS.2007.page_123.png_iocr.parse_format.json. What changes are required on my side? That format already has word-based bboxes.
Solved here: https://github.com/DS4SD/docling-parse/pull/73