Christoph Auer

Results 170 comments of Christoph Auer

@rateixei I see that the CI tests are failing, can you please re-generate the test GT? We need to see if it passes after that. Also, please rebase from `main`.

Note: We will hold off with merging this until the design proposal for inline styles is implemented: https://github.com/DS4SD/docling/discussions/894

@Nowheresly Thanks for providing a sample for this edge case. We are actively working on this topic; stay tuned for future updates.

After a closer look, @JeandeBalzac, your issue does not appear to be connected to portrait layout. It occurs simply because many elements are identified as figures, and these will export...

@JeandeBalzac If you have more affected PDFs, please attach them here. We need to analyze this problem more broadly.

@benzhang-se The core problem is representing and extracting picture contents. We are actively working on creating datasets and models for this purpose. Once available, it will be announced in the...

I am closing this issue, since we decided that "approximate" pagination in Word is not feasible to support.

TODO
- [x] Put DoclingParseV1DocumentBackend back, mark as deprecated
- [x] Correct handling of `BoundingRectangle.to_bounding_box()` when text cells are rotated, instead of fixing it in `get_text_cells`.
- [x] Add pipeline...

@samhita-alla I can reproduce this with Docling 2.17.0 and confirm that most of the content is detected as pictures only. This will need deeper analysis of the layout model.