
Creating a Docker image or Dockerfile from this repo.

Open michaelfeil opened this issue 1 year ago • 8 comments

It would be awesome to build a docker image for this repo.

michaelfeil avatar Jan 18 '24 00:01 michaelfeil

I've got one working here: https://hub.docker.com/r/dibz15/marker_docker

~~It works, but right now it re-downloads the necessary resources on each run. If someone figures out how to get those to cache, that'd be great!~~ Never mind, I got the HF models cached in the image now!

Dibz15 avatar Jan 22 '24 16:01 Dibz15

@Dibz15 would it be possible to share the Dockerfile for building it locally? It seems the multi-file conversion script, "convert.py", doesn't work, probably because of a missing dependency.

agarwalshashank95 avatar Feb 08 '24 21:02 agarwalshashank95

@agarwalshashank95 You can build off of their image, e.g. with a Dockerfile like

FROM dibz15/marker_docker:latest
RUN pip install ray
RUN pip uninstall -y torch torchvision torchaudio
RUN pip3 install torch torchvision
COPY local.env /usr/src/app/marker/marker/local.env
RUN mkdir /.cache && chmod -R 777 /.cache

with a local.env file in the same directory, containing

TORCH_DEVICE="cuda"

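and then, with the image tagged as marker:latest and PDF_DIR_SANITIZED set to the host directory holding your PDFs, run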

USER_ID=$(id -u)
GROUP_ID=$(id -g)

docker run --shm-size=10.24gb --gpus all -v "$PDF_DIR_SANITIZED":/pdfs --user $USER_ID:$GROUP_ID marker:latest python convert.py /pdfs/ /pdfs/

That said, it would be great if there were a repo-managed Dockerfile that we could all reference ...
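
If you'd rather script the run step, here is a minimal Python sketch that assembles the same `docker run` command shown above. The helper name `build_marker_command` and its parameters are made up for illustration and are not part of marker or of the Docker image; it also assumes a Unix-like host (for `os.getuid`) and Python 3.8+ (for `shlex.join`). It only builds and prints the command, it does not execute it.

```python
import os
import shlex
from pathlib import Path


def build_marker_command(pdf_dir, image="marker:latest", shm_size="10.24gb", use_gpu=True):
    """Assemble the `docker run` argument list from the post above (hypothetical helper).

    Nothing is executed here; the caller decides whether to run the command.
    """
    # Docker bind mounts need an absolute host path.
    host_dir = Path(pdf_dir).expanduser().resolve()

    cmd = ["docker", "run", f"--shm-size={shm_size}"]
    if use_gpu:
        # Matches TORCH_DEVICE="cuda" in local.env; drop it for CPU-only hosts.
        cmd += ["--gpus", "all"]
    cmd += [
        "-v", f"{host_dir}:/pdfs",
        # Run as the calling user so output files are not owned by root.
        "--user", f"{os.getuid()}:{os.getgid()}",
        image,
        "python", "convert.py", "/pdfs/", "/pdfs/",
    ]
    return cmd


if __name__ == "__main__":
    # Print a copy-pasteable command line instead of running it.
    print(shlex.join(build_marker_command("./pdfs")))
```

To actually launch the container, the returned list can be passed to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a single string avoids shell-quoting problems when the PDF directory contains spaces.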

robinsonkwame avatar Feb 15 '24 22:02 robinsonkwame

I started a repo here that uses @Dibz15's Docker image to generate markdown.

robinsonkwame avatar Feb 16 '24 15:02 robinsonkwame

@robinsonkwame Thanks a ton! I didn't realize I could have used the existing Docker image and built on top of it. This would work perfectly for my use case. But yes, I agree, there should be an official Docker image that we can all refer to.

agarwalshashank95 avatar Feb 16 '24 15:02 agarwalshashank95

Hey, sorry I lost track of this. I didn't plan to run mine on a system with CUDA support, so I didn't even think about that. Looks like it's been taken care of, though.

Dibz15 avatar Feb 16 '24 15:02 Dibz15

Here's the repo where I host the Dockerfile. I forgot to set it to public.

Dibz15 avatar Feb 16 '24 15:02 Dibz15

How do I add FastAPI to this app?

musarehmani291 avatar Jun 12 '24 01:06 musarehmani291