pdf-to-markdown

Is there any option to convert PDF to Markdown with embedded images?

Open jayanh opened this issue 8 years ago • 5 comments

Hi! This version can only convert to text. Is there any way or option to convert with media (images, etc.)?

Thanks

jayanh avatar Aug 17 '17 09:08 jayanh

Currently not, sorry. I've thought about it, and I think pdf.js allows extracting media, but I haven't tried it, and for my use it was irrelevant.

jzillmann avatar Aug 23 '17 07:08 jzillmann

Table data would be great too

marky-mark avatar Mar 16 '20 15:03 marky-mark

@jzillmann Would you accept a bounty for this feature?

berserkwarwolf avatar Sep 15 '20 18:09 berserkwarwolf

@berserkwarwolf What exactly?

  1. You want media extracted?
  2. You want media extracted and included in the markdown as links (probably downloaded as a folder)?
  3. Table data?
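
For illustration only, option 2 could look roughly like the sketch below: once images have been extracted into a folder, one Markdown image link is generated per file. This is not part of pdf-to-markdown; the function name `image_links`, the `images` link prefix, and the list of image suffixes are all assumptions made for the example.

```python
from pathlib import Path
from urllib.parse import quote

# Assumed set of image file types; adjust to whatever the extractor produces.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif"}


def image_links(image_dir: str, link_prefix: str = "images") -> list[str]:
    """Return one Markdown image link per image file found in image_dir.

    Hypothetical helper: link_prefix is the folder name used inside the
    generated Markdown, relative to where the .md file will live.
    """
    files = sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    # quote() percent-encodes spaces and other characters that would
    # otherwise break the link target.
    return [f"![{p.stem}]({link_prefix}/{quote(p.name)})" for p in files]
```

Where each link belongs in the converted text is the harder part and is not addressed here; the sketch only lists the links in file-name order.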

jzillmann avatar Sep 21 '20 19:09 jzillmann

https://pdfbox.apache.org/

REM Usage from the command line: for %f in (*.pdf) do extract "%f"
java -jar pdfbox-app-2.0.24.jar ExtractImages %1

flywire avatar Aug 22 '21 09:08 flywire
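
The batch snippet above runs PDFBox once per file. A cross-platform equivalent might look like the following sketch, which assumes `java` is on the PATH and that `pdfbox-app-2.0.24.jar` (the jar named above) sits in the working directory. The function names `extract_command` and `extract_all` are hypothetical.

```python
import subprocess
from pathlib import Path

# Jar name taken from the command above; adjust to the version you downloaded.
PDFBOX_JAR = "pdfbox-app-2.0.24.jar"


def extract_command(pdf: Path, jar: str = PDFBOX_JAR) -> list[str]:
    """Build the PDFBox ExtractImages command line for one PDF."""
    return ["java", "-jar", jar, "ExtractImages", str(pdf)]


def extract_all(folder: str = ".", jar: str = PDFBOX_JAR) -> None:
    """Run ExtractImages on every *.pdf file directly inside folder."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        # check=True raises CalledProcessError if PDFBox exits with an error.
        subprocess.run(extract_command(pdf, jar), check=True)
```

Calling `extract_all()` processes the current directory. Passing the arguments as a list rather than a single string avoids quoting problems with file names that contain spaces.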