
docker image name and GPU issue

Open stevegrubb opened this issue 1 year ago • 3 comments

I am using the latest version; I just pip installed 0.0.45. In my environment (Fedora 39) I have export DOCKER_BINARY="podman". When I build, pretty much following the example, I get this image: localhost/distribution-my-local-stack:latest. I checked whether this image was used in a container with "podman container ps -a", and the answer is no containers at all.

When I use "llama stack run", I eventually see this command on the screen:

podman run -it -p 5000:5000 -v /home/sgrubb/.llama/builds/docker/my-local-stack-run.yaml:/app/config.yaml llamastack-my-local-stack python -m llama_stack.distribution.server.server --yaml_config /app/config.yaml --port 5000

It then offers to download llamastack-my-local-stack:latest from various registries.

Why did it not use the local image? Could it be because the built image name starts with distribution- while the run command looks for llamastack-?
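Until the naming is sorted out, one workaround is to retag the built image under the name the run command looks for. A minimal sketch (the prefix substitution is my assumption based on the two names observed in this report; the script only prints the tag command rather than executing it):

```shell
# Derive the name the run command appears to expect by swapping the
# "distribution-" prefix for "llamastack-" (assumption based on the
# names above), then print the retag command for review.
BUILT="localhost/distribution-my-local-stack:latest"
EXPECTED="${BUILT/distribution-/llamastack-}"
echo "podman tag $BUILT $EXPECTED"
# → podman tag localhost/distribution-my-local-stack:latest localhost/llamastack-my-local-stack:latest
```

Running the printed podman tag command would make the local image resolvable under the expected name, so podman would not try to pull from remote registries.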

Also, I chose vllm as the inference service. It downloaded and installed nvidia drivers during the build. Once I corrected the command to use the distribution- image name, it dies with "RuntimeError: Failed to infer device type." That's when I noticed that GPUs had not been passed through. If nvidia drivers are loaded, --device nvidia.com/gpu=all needs to be added to the podman/docker command. It also wouldn't hurt to add --cgroup-conf=memory.high=32g, or something configurable, to cap memory.

Also, the command should probably include --rm so that the ephemeral container created by the command is removed when it exits.
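Putting these suggestions together, a corrected run command might look like the sketch below. This is only echoed for review, not executed; the 32g memory cap is the placeholder value suggested above, and the image name assumes the build-time name is used directly:

```shell
# Sketch of a corrected run command adding --rm (cleanup on exit),
# --device (GPU passthrough via CDI), and a memory cap. The 32g value
# and the image name are assumptions from this report. Echoed only.
IMAGE="localhost/distribution-my-local-stack:latest"
CMD="podman run --rm -it \
  --device nvidia.com/gpu=all \
  --cgroup-conf=memory.high=32g \
  -p 5000:5000 \
  -v $HOME/.llama/builds/docker/my-local-stack-run.yaml:/app/config.yaml \
  $IMAGE python -m llama_stack.distribution.server.server \
  --yaml_config /app/config.yaml --port 5000"
echo "$CMD"
```

The --device nvidia.com/gpu=all form relies on podman's CDI support for NVIDIA GPUs (provided by the nvidia-container-toolkit CDI spec); plain docker would instead use --gpus all.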

In summary: a naming mismatch between the image that is built and the image that is run, and GPUs not being enabled.

stevegrubb · Oct 24 '24 20:10