Support export of DoclingDocument to HTML

Open cau-git opened this issue 1 year ago • 4 comments

Requested feature

Docling can currently export to JSON, Markdown, and Doctags. Exporting to plain HTML would be a useful addition because it renders in any browser and correctly displays table structures with spans. This feature must be implemented in docling-core as a method, DoclingDocument.export_to_html.
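To illustrate why HTML suits tables with spans, here is a minimal sketch of how spanned cells could map onto rowspan/colspan attributes. It is only an illustration under stated assumptions: the Cell class and the table_to_html function are hypothetical names invented for this example and are not part of docling-core, and a real export_to_html would work on DoclingDocument's own table model.

```python
from dataclasses import dataclass
from html import escape
from typing import List


@dataclass
class Cell:
    """Hypothetical table cell for illustration; not a docling-core type."""

    text: str
    row_span: int = 1
    col_span: int = 1
    header: bool = False


def table_to_html(rows: List[List[Cell]]) -> str:
    """Render rows of cells as an HTML table.

    Span attributes are emitted only when a cell covers more than one
    row or column, and cell text is HTML-escaped.
    """
    lines: List[str] = ["<table>"]
    for row in rows:
        lines.append("  <tr>")
        for cell in row:
            tag = "th" if cell.header else "td"
            attrs = ""
            if cell.row_span > 1:
                attrs += f' rowspan="{cell.row_span}"'
            if cell.col_span > 1:
                attrs += f' colspan="{cell.col_span}"'
            lines.append(f"    <{tag}{attrs}>{escape(cell.text)}</{tag}>")
        lines.append("  </tr>")
    lines.append("</table>")
    return "\n".join(lines)


# A header cell spanning two columns, followed by one data row.
example = [
    [Cell("Metric", header=True), Cell("Score", col_span=2, header=True)],
    [Cell("F1"), Cell("0.91"), Cell("0.88")],
]
html_out = table_to_html(example)
print(html_out)
```

Markdown tables have no equivalent of colspan or rowspan, which is the gap an HTML export would close.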

cau-git avatar Nov 11 '24 11:11 cau-git

Hi @cau-git, please assign this issue to me; I'm interested in working on it.

taufikus avatar Nov 12 '24 07:11 taufikus

@taufikus Thanks for your interest in contributing to this issue! You are very welcome to create a proposal and submit a PR for our review.

Please note that since this issue touches a core component of Docling, the code must strictly adhere to the contribution guidelines and needs decent test coverage. I am therefore summarizing a few recommendations below:

  • Please take the export_to_markdown method as a blueprint in terms of arguments and general code structure.
  • Place type hints on method signatures and inside the code.
  • Make sure to install the pre-commit hooks (poetry run pre-commit install) before you commit, so that every commit is validated with the toolchain.
  • Add unit tests in docling_core that specifically test the features of your HTML export method.

I am assigning you to this issue. Please let me know if you want to proceed.

cau-git avatar Nov 12 '24 09:11 cau-git

@taufikus Do you have an update on this?

PeterStaar-IBM avatar Nov 18 '24 08:11 PeterStaar-IBM

@PeterStaar-IBM Sorry for the delay on this task. I've been working on the issue and have made a little progress, but it's turning out to be more complex than I initially anticipated, so I think I need more time. If someone else wants to take it on, feel free to assign it to them as well. Until then, I will keep working on it and try to come up with a solution.

taufikus avatar Nov 18 '24 13:11 taufikus