mlc-llm
Dockerfile for Nvidia GPU
Build and run it like this:

Download the model:

```sh
mkdir -p dist && git lfs install && \
  git clone https://huggingface.co/mlc-ai/demo-vicuna-v1-7b-int3 dist/vicuna-v1-7b && \
  git clone https://github.com/mlc-ai/binary-mlc-llm-libs.git dist/lib
```

Build it:

```sh
docker compose build
```

Run it:

```sh
docker compose run -i --rm mlc-llm
```
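For context, here is a minimal sketch of the `docker-compose.yml` these commands assume. Only the service name `mlc-llm` comes from the run command above; the build context, GPU reservation, and volume mount are assumptions, not the repo's actual file:

```yaml
# Sketch of a docker-compose.yml matching the commands above; everything
# except the service name "mlc-llm" is an assumption.
services:
  mlc-llm:
    build: .
    stdin_open: true        # pairs with the -i flag on `docker compose run`
    volumes:
      - ./dist:/mlc/dist    # model weights and prebuilt libs cloned above
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```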
Quite interesting indeed. I have a few questions about the Docker setup. I tested the nvidia/vulkan:1.3-470 image, but it still didn't work on my A6000 with the 525 driver installed. In my experience, Vulkan support is tied to the Nvidia driver, and some Nvidia drivers don't include it at all. Also, a Docker container must share the same driver as its host. So the name nvidia/vulkan:1.3-470 is odd: it implies Vulkan 1.3 bundled with the Nvidia 470 driver, yet the container can't guarantee it's actually running on a 470 driver. Do you have any thoughts on this?
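One way to sanity-check this is to compare the driver version and the Vulkan ICDs visible inside the container (a sketch, assuming vulkan-tools is installed in the image; the paths are the loader's usual defaults):

```sh
# Inside the container: which driver does the runtime expose, and does
# the Vulkan loader see an NVIDIA ICD?
nvidia-smi --query-gpu=driver_version --format=csv,noheader
ls /usr/share/vulkan/icd.d/   # look for nvidia_icd.json
vulkaninfo --summary          # fails with VK_ERROR_INCOMPATIBLE_DRIVER if no usable ICD
```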
I tried it in WSL and got this error:
```
(mlc-chat) nw@DESKTOP-BADE1QI:/mnt/c/Ubuntu/mlc$ docker compose run -i --rm mlc-llm
[+] Running 1/0
 ✔ Network mlc_default  Created 0.0s
terminate called after throwing an instance of 'tvm::runtime::InternalError'
  what(): [11:52:39] /home/runner/work/utils/utils/tvm/src/runtime/vulkan/vulkan_instance.cc:111:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (__e == VK_SUCCESS) is false: Vulkan Error, code=-9: VK_ERROR_INCOMPATIBLE_DRIVER
Stack trace:
  [bt] (0) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(tvm::runtime::Backtrace[abi:cxx11]()+0x27) [0x7fae3af2fc97]
  [bt] (1) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(+0x3f375) [0x7fae3aecd375]
  [bt] (2) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(tvm::runtime::vulkan::VulkanInstance::VulkanInstance()+0x1a47) [0x7fae3b01ec07]
  [bt] (3) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(tvm::runtime::vulkan::VulkanDeviceAPI::VulkanDeviceAPI()+0x40) [0x7fae3b01ae10]
  [bt] (4) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(tvm::runtime::vulkan::VulkanDeviceAPI::Global()+0x4c) [0x7fae3b01b1ac]
  [bt] (5) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(+0x18d1ed) [0x7fae3b01b1ed]
  [bt] (6) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(+0x6c024) [0x7fae3aefa024]
  [bt] (7) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(+0x6c5c7) [0x7fae3aefa5c7]
  [bt] (8) mlc_chat_cli(+0xeab0) [0x556e5780cab0]
/tmp/tmpi1l8mqe_: line 3: 15 Aborted /bin/bash -c mlc_chat_cli
ERROR conda.cli.main_run:execute(47): `conda run /bin/bash -c mlc_chat_cli` failed. (See above for error)
```
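VK_ERROR_INCOMPATIBLE_DRIVER generally means the Vulkan loader found no usable ICD inside the container. A quick check (a sketch, assuming vulkan-tools is available; the paths are the loader's usual search locations):

```sh
# List Vulkan ICD manifests visible to the loader.
ls /usr/share/vulkan/icd.d/ /etc/vulkan/icd.d/ 2>/dev/null

# Ask the loader directly. Under WSL2 the GPU is passed through via
# /dev/dxg rather than a native NVIDIA Vulkan driver, so this can fail
# even when CUDA works fine.
vulkaninfo --summary
```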
Hey, I made a Docker image that may help benchmark MLC LLM performance: https://github.com/junrushao/llm-perf-bench

On the other hand, I don't really think Docker is a perfect abstraction for these use cases (conda may be), but this image should be a decent one.
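For anyone who prefers the conda route, a hypothetical environment file might look like this; the channel and package name follow MLC's install instructions at the time and may have changed since:

```yaml
# Hypothetical environment.yml for running mlc_chat_cli without Docker.
# Channel/package names are assumptions based on MLC's docs of the era.
name: mlc-chat
channels:
  - mlc-ai
  - conda-forge
dependencies:
  - git-lfs
  - mlc-chat-cli-nightly
```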