llama-cpp-python
Add Dockerfile + build workflow
Fixes #70
This PR adds a Dockerfile and updates the release workflow to also build the latest Docker image. Both amd64 and arm64 architectures are built.
@Niek do you mind moving this to the build release workflow?
@abetlen are you referring to build-and-release.yml? If we move the Docker step to that action, it can't use `pip install`, though - it will have to download the artifacts and use those instead. Not sure if this is what you intend.
Maybe we should directly add OpenBLAS support? It would need these two lines:

```dockerfile
RUN apt update && apt install -y libopenblas-dev
RUN LLAMA_OPENBLAS=1 pip install llama-cpp-python[server]
```
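A minimal sketch of how those two lines could slot into a complete image - the base image, extra build dependencies, and paths here are assumptions for illustration, not what this PR ships:

```dockerfile
# Sketch only - base image and package list are assumptions.
FROM python:3.10-slim-bullseye
EXPOSE 8000
ENV HOST=0.0.0.0
# Toolchain plus OpenBLAS headers, so pip can compile llama-cpp-python against OpenBLAS.
RUN apt update && apt install -y build-essential cmake libopenblas-dev
RUN LLAMA_OPENBLAS=1 pip install llama-cpp-python[server]
ENTRYPOINT [ "python3", "-m", "llama_cpp.server" ]
```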
Good idea @jmtatsch - added now
Here is a Dockerfile for a cuBLAS-capable container that should bring huge speed-ups for CUDA GPU owners after the next sync with upstream:
```dockerfile
FROM nvidia/cuda:12.1.0-devel-ubuntu22.04
EXPOSE 8000
ENV MODEL=/models/ggml-vicuna-13b-1.1-q4_0.bin
# allow non-local connections to the API
ENV HOST=0.0.0.0
RUN apt update && apt install -y python3 python3-pip && LLAMA_CUBLAS=1 pip install llama-cpp-python[server]
ENTRYPOINT [ "python3", "-m", "llama_cpp.server" ]
```
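For anyone wanting to try it, a hypothetical build-and-run invocation (the image tag and host model directory are placeholders; GPU passthrough requires the NVIDIA Container Toolkit on the host):

```sh
# Image tag and /path/to/models are placeholders.
docker build -t llama-cpp-python-cublas .
# --gpus all passes the host's NVIDIA GPUs through to the container.
docker run --gpus all -p 8000:8000 \
  -v /path/to/models:/models \
  llama-cpp-python-cublas
```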
@jmtatsch where is requirements.txt coming from?
good catch, it isn't necessary at all. I cleaned it up above. In 0.1.36, cuBLAS is broken for me anyhow - waiting on https://github.com/ggerganov/llama.cpp/pull/1128
@abetlen do you need any other changes?
@Niek if possible, can we include @jmtatsch's nvidia-docker container example in this PR as well? The ability to `docker pull` and run a GPU-accelerated container would be very helpful.
@abetlen We should make these two different containers then, because the nvidia container with cuBLAS is quite fat and not everyone has an Nvidia card. I will make a pull request once this one is merged. Sorry for hijacking your pull request @Niek
@Niek finally got a chance to merge this, great work! We now have a docker image.
@jmtatsch if you're still interested it would be awesome to get that cuBLAS-based image, happy to help there also.