Update/retrain layout model to correctly identify single-column reference pages

Open PeterStaar-IBM opened this issue 10 months ago • 1 comment

Bug

Currently, I see that the layout model sometimes classifies reference lists as tables.

Steps to reproduce

example 1: https://arxiv.org/pdf/2106.09685

example 2: https://arxiv.org/pdf/2501.12948
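
To check either example locally, a minimal sketch along these lines may help. It assumes the Docling 2.x Python API (`DocumentConverter.convert`, `document.tables`, and the `page_no` field on each table's provenance); the helper names `table_pages` and `tables_per_page` are hypothetical and only for illustration.

```python
from collections import Counter


def tables_per_page(page_numbers):
    """Count detected tables per page; returns {page_no: count} ordered by page."""
    return dict(sorted(Counter(page_numbers).items()))


def table_pages(source):
    """Convert `source` with Docling and list the page of every detected table.

    Assumption: Docling 2.x API (DocumentConverter, document.tables, prov.page_no).
    """
    # Imported lazily so tables_per_page() works without docling installed.
    from docling.document_converter import DocumentConverter

    doc = DocumentConverter().convert(source).document
    # Each table item carries provenance entries naming the page it was found on.
    return [prov.page_no for table in doc.tables for prov in table.prov]
```

Calling `tables_per_page(table_pages("https://arxiv.org/pdf/2106.09685"))` should then list how many tables were detected on each page; counts on the reference pages would point to the misclassification described here.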

Docling version

Docling version: 2.18.0
Docling Core version: 2.17.1
Docling IBM Models version: 3.3.0
Docling Parse version: 3.2.0
Python: cpython-312 (3.12.6)
Platform: macOS-15.3-arm64-arm-64bit

PeterStaar-IBM, Feb 07 '25 07:02

see https://github.com/docling-project/docling-ibm-models/pull/92

cau-git, May 21 '25 12:05