Christoph Auer
@naufalso Thanks for your input. Please see the following two discussions to understand our take on this: - https://github.com/DS4SD/docling/discussions/377 - https://github.com/DS4SD/docling/discussions/306
@Navanit-git From what I can see, your proposed change would not have much effect: 1. If you use HF datasets `snapshot_download`, it will only re-transfer the actual assets if the...
@Navanit-git We haven't seen an update in two weeks, hence I will close this PR. Feel free to re-open it if you want to follow up again. Thanks!
Closing this issue, since it is tracked in #913 and new Markdown serializers in docling-core address the other aspects.
@simonschoe We have that _partially_ established through: - `PictureItem.export_to_markdown` - `TableItem.export_to_markdown` We could extend it to other item types, but most of the others need no special treatment, since they...
@SimJeg We will implement a design as proposed here: https://github.com/DS4SD/docling/discussions/894 Then, this work will be able to make use of it.
We must agree on a name to replace `UNSUPPORTED`. ``` DROPPED DISCARDED SKIPPED # favourite ``` and add a reason that explains properly what went wrong as an `ErrorItem` in...
@mawi12345 I agree that the reading order is ill-defined in PowerPoint, since it is not necessarily the order in which elements were inserted or created. However, sorting it from top...
Closing this, moved back to issue https://github.com/docling-project/docling/issues/2097
This does not seem to be reproducible in any environment other than a Windows 11 ARM VM on UTM. Closing.