
Result viewer application

Open pJahad opened this issue 1 year ago • 6 comments

Requested feature

A web app for inspecting Docling parser results, like the one in the pasted image.

Screenshot 2024-11-08 9 44 22

(taken from https://ds4sd.github.io/#use-cases )

pJahad avatar Nov 08 '24 00:11 pJahad

@pJahad This would indeed be nice to have, but we need to verify that we do not introduce too many dependencies.

I believe we could add a Streamlit-like app, but we need to engineer it carefully. In the meantime, I know that there is already a Hugging Face Space: https://huggingface.co/spaces/yasserrmd/DoclingConverter

PeterStaar-IBM avatar Nov 08 '24 05:11 PeterStaar-IBM

@pJahad I created an easily deployable API for this, in case you decide to build a web app for it. You can check it out here: docling-api

drmingler avatar Nov 09 '24 18:11 drmingler

@pJahad @drmingler Note that we have a webserver for docling in the works. It is currently at an experimental stage. See here.

cau-git avatar Nov 11 '24 09:11 cau-git

What about a pure javascript HTML page without any server? Overlay may be provided by Mozilla's PDF.js

Upabjojr avatar Nov 22 '24 06:11 Upabjojr

> What about a pure javascript HTML page without any server? Overlay may be provided by Mozilla's PDF.js

That would still require some backend service. Docling itself can't run in the browser.

cau-git avatar Nov 22 '24 07:11 cau-git

> > What about a pure javascript HTML page without any server? Overlay may be provided by Mozilla's PDF.js
>
> That would still require some backend service. Docling itself can't run in the browser.

Not really, you can just open simple HTML files in your browser without a server if the HTML is not interacting with a server.

Upabjojr avatar Nov 22 '24 08:11 Upabjojr

https://gist.github.com/Upabjojr/97d0debbd67e9e3c81e57a4cea0d51a4

I wrote a simple HTML page that displays Docling's JSON output graphically.

Upabjojr avatar Jan 03 '25 09:01 Upabjojr

@Upabjojr That is great, but I hope you do know that you can export/save any document from the conversion result straight into HTML.

see here: https://github.com/DS4SD/docling/blob/main/docs/examples/export_figures.py#L83

PeterStaar-IBM avatar Jan 03 '25 09:01 PeterStaar-IBM

> @Upabjojr That is great, but I hope you do know that you can export/save any document from the conversion result straight into HTML.
>
> see here: https://github.com/DS4SD/docling/blob/main/docs/examples/export_figures.py#L83

Yes, I've noticed you've added this export feature to docling. I had already written this snippet for your older versions, so I just decided to share it.

Upabjojr avatar Jan 03 '25 10:01 Upabjojr

Maybe .export_to_html() should also have an option to lay out the HTML output according to the bounding boxes of the source document?

Upabjojr avatar Jan 22 '25 09:01 Upabjojr
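
For readers wondering what a bounding-box layout could look like, here is a minimal sketch. It is not how .export_to_html() works; it only illustrates the idea of mapping each item's bounding box to CSS absolute positioning. The field names used below (pages, size, texts, prov, page_no, bbox with l/t/r/b and coord_origin) are assumptions about the layout of the JSON output, and render_page and bbox_to_css are hypothetical helper names.

```python
from html import escape


def bbox_to_css(bbox, page_height):
    """Convert a bounding box into top-left-origin CSS values (in px).

    Assumes the box carries l/t/r/b keys and an optional coord_origin key
    (assumed field names, defaulting to a bottom-left origin as in PDF).
    """
    left, right = bbox["l"], bbox["r"]
    if bbox.get("coord_origin", "BOTTOMLEFT") == "BOTTOMLEFT":
        # PDF-style coordinates: y grows upward, so flip against the page height.
        top = page_height - bbox["t"]
        height = bbox["t"] - bbox["b"]
    else:
        # Top-left origin: y already grows downward.
        top = bbox["t"]
        height = bbox["b"] - bbox["t"]
    return {"left": left, "top": top, "width": right - left, "height": height}


def render_page(doc, page_no):
    """Render one page of a Docling-like dict as absolutely positioned <div>s."""
    size = doc["pages"][str(page_no)]["size"]
    parts = [
        f'<div style="position:relative;width:{size["width"]}px;'
        f'height:{size["height"]}px;border:1px solid #999">'
    ]
    for item in doc.get("texts", []):
        for prov in item.get("prov", []):
            if prov["page_no"] != page_no:
                continue
            css = bbox_to_css(prov["bbox"], size["height"])
            style = ";".join(f"{key}:{value}px" for key, value in css.items())
            parts.append(
                f'<div style="position:absolute;{style};outline:1px solid red">'
                f'{escape(item["text"])}</div>'
            )
    parts.append("</div>")
    return "\n".join(parts)


# Hand-written sample in the assumed layout (not real converter output).
sample = {
    "pages": {"1": {"size": {"width": 612, "height": 792}}},
    "texts": [
        {
            "text": "Title & intro",
            "prov": [
                {
                    "page_no": 1,
                    "bbox": {"l": 72, "t": 720, "r": 300, "b": 700,
                             "coord_origin": "BOTTOMLEFT"},
                }
            ],
        }
    ],
}

html_page = render_page(sample, 1)
```

Writing html_page to a file and opening it in a browser needs no server, which matches the "pure HTML page" idea discussed earlier in the thread. Font sizes are not adjusted to fit the boxes, so text may overflow its outline.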

A short update for the thread: we are finalizing an actual TypeScript/JavaScript SDK, which will let you work directly with all the features of the DoclingDocument format.

dolfim-ibm avatar Jan 22 '25 11:01 dolfim-ibm

@dolfim-ibm Hi! Will this be available as part of an upcoming docling release, or in a separate repo/project?

FloMrt avatar Mar 04 '25 15:03 FloMrt

@FloMrt It got finalized here: https://github.com/DS4SD/docling-ts. It is also used in docling-serve (https://github.com/DS4SD/docling-serve).

PeterStaar-IBM avatar Mar 04 '25 16:03 PeterStaar-IBM

Perfect, thank you @PeterStaar-IBM! I will take a look.

FloMrt avatar Mar 04 '25 20:03 FloMrt