docling
Can't find a way to use `ImageRefMode.EMBEDDED` in `generate_multimodal_pages`
Question
I tried to adapt the `generate_multimodal_pages` example from the official documentation. I want to export `content_md` with `ImageRefMode.EMBEDDED`, but the method appears to be a legacy version, and I can't find a way to pass `ImageRefMode.EMBEDDED` to it.
    def _process_page():
        page_ix = page_no - 1
        page = doc_result.pages[page_ix]
        page_cells = _process_page_cells(page=page)
        page_segments = _process_page_segments(doc_items=doc_items, page=page)
        content_md = doc.export_to_markdown(main_text_start=start_ix, main_text_stop=end_ix)
        # No page-tagging since we only process one page at a time
        content_dt = doc.export_to_document_tokens(
            main_text_start=start_ix, main_text_stop=end_ix, add_page_index=False
        )
Do you have any suggestions for using this? The main issue seems to be that the `export_to_markdown` method in `DoclingDocument` doesn't support the `main_text_start` and `main_text_stop` parameters.
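For anyone landing here with the same question, one possible workaround is to skip `generate_multimodal_pages` and export page by page from the `DoclingDocument` directly. The sketch below is not the maintainers' recommended fix. It assumes that the installed docling-core version accepts `image_mode` and `page_no` keyword arguments on `export_to_markdown`, and that `doc.pages` is a mapping keyed by page number. The helper names `export_pages_markdown` and `convert_and_export` are hypothetical.

```python
from typing import Any, Dict


def export_pages_markdown(doc: Any, image_mode: Any) -> Dict[int, str]:
    """Export each page of a DoclingDocument-like object to Markdown.

    Assumptions (check against your installed version):
    - doc.pages is a mapping keyed by page number
    - doc.export_to_markdown accepts image_mode and page_no keywords
    """
    return {
        page_no: doc.export_to_markdown(image_mode=image_mode, page_no=page_no)
        for page_no in sorted(doc.pages)
    }


def convert_and_export(pdf_path: str) -> Dict[int, str]:
    """Hypothetical end-to-end helper: convert a PDF, then export per page."""
    # Imports are local so export_pages_markdown stays usable without docling.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption
    from docling_core.types.doc import ImageRefMode

    options = PdfPipelineOptions()
    options.images_scale = 2.0
    # Picture images must be generated during conversion, otherwise
    # there is nothing to embed at export time.
    options.generate_picture_images = True

    converter = DocumentConverter(
        format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=options)}
    )
    doc = converter.convert(pdf_path).document
    return export_pages_markdown(doc, ImageRefMode.EMBEDDED)
```

Calling `convert_and_export("some.pdf")` should then return one Markdown string per page, with pictures inlined rather than replaced by placeholders. If your version lacks `page_no`, exporting the whole document with `image_mode=ImageRefMode.EMBEDDED` is the simpler fallback.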
Thanks for the report, we will have a look and first update the examples.
I'm closing this issue in favour of a broader story in #835. New examples for multimodal exports and datasets creation are coming.