[Bug]: Web UI uses poorly maintained pdf-to-markdown javascript library

Open ksylvan opened this issue 6 months ago • 2 comments

What happened?

The latest pdfjs (version 5.x) does not render correctly with the way the current Svelte app uses pdf-to-markdown.

I managed to upgrade the ancient version of pdf-to-markdown we were using, along with its correspondingly old pdfjs-dist dependency, to a more recent pdfjs 4.x version.

The current maintainer of pdf-to-markdown says here: https://github.com/jzillmann/pdf-to-markdown/issues/10#issuecomment-1413796796

[...] the modularize branch isn't working yet (and I'm currently not active). So no way using it. Please checkout https://www.npmjs.com/package/@opendocsg/pdf2md

@opendocsg/pdf2md is a much more actively maintained and current JavaScript PDF-to-Markdown project.

We should replace the pdf-to-markdown usage with this library.
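
One way to keep the swap contained is to put the converter behind a small adapter, so the Svelte code does not call either library directly. The following is only a sketch: `PdfConverter` and `pdfFileToMarkdown` are hypothetical names, not taken from the Fabric codebase, and the converter is passed in as a parameter so that nothing here assumes a particular library API.

```typescript
// Hypothetical adapter: the web UI depends on this signature only,
// not on pdf-to-markdown or @opendocsg/pdf2md directly.
type PdfConverter = (pdf: Uint8Array) => Promise<string>;

// Reads an uploaded file (anything with arrayBuffer(), such as a browser File)
// and hands its bytes to whichever converter was injected.
async function pdfFileToMarkdown(
  file: { arrayBuffer(): Promise<ArrayBuffer> },
  convert: PdfConverter,
): Promise<string> {
  const bytes = new Uint8Array(await file.arrayBuffer());
  if (bytes.length === 0) {
    throw new Error("PDF file is empty");
  }
  const markdown = await convert(bytes);
  // Trim stray leading/trailing whitespace from the converter output.
  return markdown.trim();
}
```

If @opendocsg/pdf2md really does accept a PDF buffer and return a promise of Markdown text, as its README suggests, it could probably be plugged in as `pdfFileToMarkdown(file, (bytes) => pdf2md(bytes))`. That call shape is an assumption and still needs to be checked in the browser build.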

CC: @jmd1010 @eugeis @danielmiessler

Version check

  • [x] Yes I was.

Relevant log output


Relevant screenshots (optional)

No response

ksylvan, May 25 '25 18:05

Hey Kayvan, have you tinkered with pdf2md and gotten it to work in the Fabric web UI? I remember testing a few of these libraries, and the jzillmann one was the only one that kind of worked without investing too much time.

Let me know if you have success. I'm just swamped these days with other commitments, but we can take a kick at this can together or get others to help on this one. Anyone?

jmd1010, May 26 '25 15:05

Thanks Jean. I'll take a crack at it.

Like I said, I managed to update to pdfjs 4.x, but Dependabot keeps wanting us to go up to pdfjs 5.x.

ksylvan, May 26 '25 15:05