Docker Build Error: failed to create task for container: unable to start container process: exec: "v2" executable file not found in $PATH

Open lisanyambere opened this issue 10 months ago • 6 comments

Description

I'm attempting to build a Docker image that integrates both Infinity and the Stella_400M_v5 model for CPU development. Previously, I ran the Infinity image with a model ID successfully, but now I need a single image that includes both Infinity and the model. However, when I run the container, I encounter errors, most notably:

failed to create task for container: unable to start container process: exec: "v2" executable file not found in $PATH

Additionally, errors indicate that the model ONNX files cannot be found.

Steps to Reproduce

  1. Base Image: Start with the Infinity base image (michaelf34/infinity:latest-cpu).
  2. Working Directory: Set the working directory to /app.
  3. Install Dependencies: Install git and git-lfs to handle large files.
  4. Initialize Git LFS: Run git lfs install.
  5. Clone Model Repository: Create a /models directory and clone the Stella_400M_v5 repository from Hugging Face.
  6. Create Cache Directory: Create a cache directory at /app/.cache.
  7. Expose Port: Expose port 7997.
  8. Entry Point and Command: Set the entry point to Infinity's CLI and attempt to run the model with parameters.

Dockerfile

FROM michaelf34/infinity:latest-cpu
WORKDIR /app
RUN apt-get update && apt-get install -y git git-lfs && rm -rf /var/lib/apt/lists/*
RUN git lfs install
RUN mkdir -p /models && \
    cd /models && \
    git clone https://huggingface.co/NovaSearch/stella_en_400M_v5
RUN mkdir -p /app/.cache
EXPOSE 7997
ENTRYPOINT ["infinity_emb"]
CMD ["v2", "--engine", "optimum", "--model-id", "/models/stella_en_400M_v5", "--port", "7997"]

Expected Behavior

The Docker image should build without issues. On running the container, Infinity's CLI should start and correctly load the Stella_400M_v5 model on CPU.

Actual Behavior

The container fails to start with this error:

failed to create task for container: unable to start container process: exec: "v2" executable file not found in $PATH

There are also errors about missing model ONNX files. The ONNX error goes away when I download the ONNX files locally from Hugging Face, but the "v2" error remains.

Additional Context

The build is being performed for CPU development. The Infinity image works correctly when run separately with a model ID, which suggests the issue might be related to how the executable (v2) is specified or how the model files are managed.

Request for Assistance

  1. Clarification Needed: Is the error related to the way the executable is specified (i.e., "v2") or might it be an issue with file paths for the model ONNX files? Could there be an issue with the entrypoint or CMD syntax that's preventing execution?
  2. Suggestions: Any guidance on adjusting the Dockerfile or entrypoint configuration to successfully build and run the image would be greatly appreciated.

lisanyambere avatar Feb 18 '25 20:02 lisanyambere
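
On the ENTRYPOINT/CMD question: with the exec form used in this Dockerfile, Docker appends the CMD words to the ENTRYPOINT words, and the first word of the combined list is the program that gets executed. The sketch below only mimics that rule in plain shell so the failure mode is easy to see. first_word_executed is a hypothetical helper for illustration, not a Docker command.

```shell
#!/bin/sh
# Hypothetical helper: mimic how ENTRYPOINT and CMD (exec form) are combined.
# $1 = ENTRYPOINT words (may be empty), $2 = CMD words.
first_word_executed() {
  # Unquoted on purpose: split both strings into words, ENTRYPOINT first.
  set -- $1 $2
  # The first word of the combined list is what the runtime tries to exec.
  printf '%s\n' "$1"
}

# Entrypoint intact: infinity_emb is executed and "v2" is only its first argument.
first_word_executed "infinity_emb" "v2 --engine optimum --port 7997"

# Entrypoint empty or overridden away: "v2" itself becomes the executable.
first_word_executed "" "v2 --engine optimum --port 7997"
```

The second case is the kind of situation that would be expected to produce exec: "v2" executable file not found in $PATH, since the runtime would be looking for a program literally named v2.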

Can you format your issue better. Thanks.

michaelfeil avatar Feb 18 '25 20:02 michaelfeil

Can you format your issue better. Thanks.

Done, thanks

lisanyambere avatar Feb 19 '25 14:02 lisanyambere

Hi, not formatted!

michaelfeil avatar Feb 19 '25 16:02 michaelfeil

done

lisanyambere avatar Feb 19 '25 18:02 lisanyambere

You are running the v2 command. Please verify the command by opening an interactive shell inside the container. Thanks

michaelfeil avatar Feb 19 '25 19:02 michaelfeil
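
A minimal sketch of that check, assuming the image was built with a tag (the name my-infinity-stella in the usage note is hypothetical). Overriding the entrypoint with a shell makes it possible to confirm that infinity_emb is on $PATH and that v2 is one of its subcommands, independent of the image's own ENTRYPOINT and CMD. The function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; pass your own image name or ID as $1.

# Open an interactive shell in the image, bypassing its ENTRYPOINT.
open_shell() {
  docker run --rm -it --entrypoint /bin/sh "$1"
}

# Non-interactive variant: check that infinity_emb resolves on $PATH
# and print the help text of its v2 subcommand.
check_cli() {
  docker run --rm --entrypoint /bin/sh "$1" \
    -c 'command -v infinity_emb && infinity_emb v2 --help'
}
```

For example: check_cli my-infinity-stella. If command -v prints nothing, the CLI is not on $PATH in that image, which would point at the image rather than at the CMD arguments.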

I tried to reproduce your error using this Dockerfile:

FROM michaelf34/infinity:latest-cpu
WORKDIR /app
RUN apt-get update && apt-get install -y git git-lfs && rm -rf /var/lib/apt/lists/*
RUN git lfs install
RUN mkdir -p /models && cd /models && git clone https://huggingface.co/BAAI/bge-small-en-v1.5
RUN mkdir -p /app/.cache
EXPOSE 7997
ENTRYPOINT ["infinity_emb"]
CMD ["v2", "--engine", "optimum", "--model-id", "/models/bge-small-en-v1.5", "--port", "7997"]

Then I ran docker build . which resulted in:

 => [internal] load build definition from Dockerfile                                                     0.0s
 => => transferring dockerfile: 446B                                                                     0.0s
 => [internal] load metadata for docker.io/michaelf34/infinity:latest-cpu                                0.7s
 => [internal] load .dockerignore                                                                        0.0s
 => => transferring context: 2B                                                                          0.0s
 => [1/6] FROM docker.io/michaelf34/infinity:latest-cpu@sha256:791e6b8a4eab6ed1bdea40358f6ce43cde908244  0.0s
 => CACHED [2/6] WORKDIR /app                                                                            0.0s
 => CACHED [3/6] RUN apt-get update && apt-get install -y git git-lfs && rm -rf /var/lib/apt/lists/*     0.0s
 => CACHED [4/6] RUN git lfs install                                                                     0.0s
 => CACHED [5/6] RUN mkdir -p /models && cd /models && git clone https://huggingface.co/BAAI/bge-small-  0.0s
 => CACHED [6/6] RUN mkdir -p /app/.cache                                                                0.0s
 => exporting to image                                                                                   0.0s
 => => exporting layers                                                                                  0.0s
 => => writing image sha256:6ee4feed27c84d5df2b8ed79b276b7e0bb8edc62c93b426e847bafc2e415da46  

Then I can run the image with:

docker run -p 7997:7997 sha256:6ee4feed27c84d5df2b8ed79b276b7e0bb8edc62c93b426e847bafc2e415da46

I can then check that the model is being served:

curl http://localhost:7997/models
{"data":[{"id":"models/bge-small-en-v1.5","stats":{"queue_fraction":0.0,"queue_absolute":0,"results_pending":0,"batch_size":32},"object":"model","owned_by":"infinity","created":1740021528,"backend":"optimum","capabilities":["embed"]}],"object":"list"}

Can you check your Dockerfile to make sure you did not change the entrypoint to v2? That could result in the error as described.

wirthual avatar Feb 20 '25 03:02 wirthual
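
One way to check the entrypoint without starting a container is docker inspect, which can print the ENTRYPOINT and CMD recorded in the built image. This is a sketch: show_start_config is a hypothetical helper name, and the image reference is whatever tag or ID your build produced.

```shell
#!/bin/sh
# Hypothetical helper: print the ENTRYPOINT and CMD stored in an image's config.
show_start_config() {
  docker inspect \
    --format 'ENTRYPOINT={{json .Config.Entrypoint}} CMD={{json .Config.Cmd}}' \
    "$1"
}
```

For the Dockerfile in this thread, one would expect ENTRYPOINT to contain infinity_emb and CMD to start with v2. Arguments given after the image name in docker run replace CMD, and --entrypoint replaces ENTRYPOINT, so the exact docker run command is worth checking as well.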