Create a new latest docker image on docker hub

Open universeroc opened this issue 1 year ago • 8 comments

The latest tag is: 0.18.8.rc2-master-20200820-alpine-3.12.0-x86_64

I can't use docker pull pdf2htmlex/pdf2htmlex to get the image; it only works if I specify the full tag: docker pull pdf2htmlex/pdf2htmlex:0.18.8.rc2-master-20200820-alpine-3.12.0-x86_64

The image was created 3 years ago, so it's too old.

Please create a new docker image based on the latest version of code. Thanks in advance.

If I can help, I'd be glad to do something :)

https://hub.docker.com/r/pdf2htmlex/pdf2htmlex/

universeroc avatar Oct 09 '23 06:10 universeroc

Good issue. Trying to pull the existing image, I got the following:

docker pull pdf2htmlex/pdf2htmlex
Using default tag: latest
Error response from daemon: manifest for pdf2htmlex/pdf2htmlex:latest not found: manifest unknown: manifest unknown

konradzdeb avatar Oct 09 '23 21:10 konradzdeb

Is this project no longer being updated?

ayoporridge avatar Oct 24 '23 00:10 ayoporridge

Docker is outdated… Homebrew is outdated…

Is this project abandoned or what?

remino avatar Nov 15 '23 05:11 remino

In the meantime, note that because Docker tries to fetch the latest tag of an image by default, and there is no latest tag on Docker Hub, the command in the wiki will fail. Instead, I explicitly fetched the most recent tag:

docker run -ti --rm -v ~/pdf:/pdf -w /pdf pdf2htmlex/pdf2htmlex:0.18.8.rc2-master-20200820-alpine-3.12.0-x86_64 --zoom 1.3 input.pdf

It worked well and did a decent job. Sadly, on my end I still need to do some troubleshooting, because all the text is invisible from the 4th slide onward in the PDF file I converted to HTML.

remino avatar Nov 15 '23 05:11 remino
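
The workaround above can be wrapped in a small shell function so the long tag only has to be typed once. This is a minimal sketch, not something from the project: the function name pdf2html is hypothetical, the tag is the one quoted in this thread, and it assumes the PDFs live in the current directory.

```shell
#!/bin/sh
# Image and tag as quoted in this thread; there is no "latest" tag on Docker Hub.
IMAGE="pdf2htmlex/pdf2htmlex"
TAG="0.18.8.rc2-master-20200820-alpine-3.12.0-x86_64"

# Hypothetical helper (not part of pdf2htmlEX): runs the pinned image with
# the current directory mounted at /pdf and passes all arguments through.
pdf2html() {
    docker run --rm -v "$(pwd)":/pdf -w /pdf "$IMAGE:$TAG" "$@"
}

# Optional, run by hand: alias the pinned image as "latest" locally so the
# untagged command from the wiki finds a local image instead of pulling.
#   docker pull "$IMAGE:$TAG"
#   docker tag "$IMAGE:$TAG" "$IMAGE:latest"
```

After sourcing this, something like pdf2html --zoom 1.3 input.pdf should behave like the full command above. The local "latest" alias only exists on the machine where docker tag was run; it does not change anything on Docker Hub.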

Thanks, it seems the project may be abandoned but at least your suggestion is getting me to the next step of unraveling this mess. Is there any other decent FOSS pdf2html platform?


Amazingly, this worked great on the first try; I am stunned. It's such a shame the project is not maintained (broken usage instructions, funky build scripts where it's unclear how to build a fresh Docker image).

This command worked like a charm:

docker run --rm -it -v `pwd`/pdf:/pdf -w /pdf pdf2htmlex/pdf2htmlex:0.18.8.rc2-master-20200820-alpine-3.12.0-x86_64 /pdf/doc.pdf

I'm amazed; it's rare to find something so poorly maintained that works so well! I hope someone will fork it and continue the work.

sleaze avatar Aug 25 '24 20:08 sleaze