mlc-llm
Dockerfile for Nvidia GPU
Build and run it like this:

Download the model:

```sh
mkdir -p dist && git lfs install && \
  git clone https://huggingface.co/mlc-ai/demo-vicuna-v1-7b-int3 dist/vicuna-v1-7b && \
  git clone https://github.com/mlc-ai/binary-mlc-llm-libs.git dist/lib
```

Build it:

```sh
docker compose build
```

Run it:

```sh
docker compose run -i --rm mlc-llm
```
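For context, here is a minimal sketch of the `docker-compose.yml` these commands assume. Only the service name `mlc-llm` comes from the run command above; the build context, GPU reservation, and volume mount are assumptions, not the repo's actual file:

```yaml
# Sketch of a docker-compose.yml matching the commands above; everything
# except the service name "mlc-llm" is an assumption.
services:
  mlc-llm:
    build: .
    stdin_open: true        # pairs with the -i flag on `docker compose run`
    volumes:
      - ./dist:/mlc/dist    # model weights and prebuilt libs cloned above
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```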
Quite interesting indeed. I have a few questions about the Docker setup. I tested the nvidia/vulkan:1.3-470 image, but it still didn't work on my A6000 with the 525 driver installed. In my experience, Vulkan support is tied to the Nvidia driver, and some Nvidia drivers don't include it at all. Also, a Docker container must share the same driver as its host. So the name nvidia/vulkan:1.3-470 is odd: it implies Vulkan 1.3 bundled with the Nvidia 470 driver, yet the container can't guarantee it's actually running on a 470 driver. Do you have any thoughts on this?
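One way to sanity-check this is to compare the driver version and the Vulkan ICDs visible inside the container (a sketch, assuming vulkan-tools is installed in the image; the paths are the loader's usual defaults):

```sh
# Inside the container: which driver does the runtime expose, and does
# the Vulkan loader see an NVIDIA ICD?
nvidia-smi --query-gpu=driver_version --format=csv,noheader
ls /usr/share/vulkan/icd.d/   # look for nvidia_icd.json
vulkaninfo --summary          # fails with VK_ERROR_INCOMPATIBLE_DRIVER if no usable ICD
```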
I tried it in WSL and got this error:
```
(mlc-chat) nw@DESKTOP-BADE1QI:/mnt/c/Ubuntu/mlc$ docker compose run -i --rm mlc-llm
[+] Running 1/0
 ✔ Network mlc_default  Created 0.0s
terminate called after throwing an instance of 'tvm::runtime::InternalError'
  what(): [11:52:39] /home/runner/work/utils/utils/tvm/src/runtime/vulkan/vulkan_instance.cc:111:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (__e == VK_SUCCESS) is false: Vulkan Error, code=-9: VK_ERROR_INCOMPATIBLE_DRIVER
Stack trace:
  [bt] (0) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(tvm::runtime::Backtrace[abi:cxx11]()+0x27) [0x7fae3af2fc97]
  [bt] (1) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(+0x3f375) [0x7fae3aecd375]
  [bt] (2) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(tvm::runtime::vulkan::VulkanInstance::VulkanInstance()+0x1a47) [0x7fae3b01ec07]
  [bt] (3) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(tvm::runtime::vulkan::VulkanDeviceAPI::VulkanDeviceAPI()+0x40) [0x7fae3b01ae10]
  [bt] (4) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(tvm::runtime::vulkan::VulkanDeviceAPI::Global()+0x4c) [0x7fae3b01b1ac]
  [bt] (5) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(+0x18d1ed) [0x7fae3b01b1ed]
  [bt] (6) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(+0x6c024) [0x7fae3aefa024]
  [bt] (7) /opt/conda/envs/mlc-chat/bin/../lib/libtvm_runtime.so(+0x6c5c7) [0x7fae3aefa5c7]
  [bt] (8) mlc_chat_cli(+0xeab0) [0x556e5780cab0]
/tmp/tmpi1l8mqe_: line 3: 15 Aborted /bin/bash -c mlc_chat_cli
ERROR conda.cli.main_run:execute(47): `conda run /bin/bash -c mlc_chat_cli` failed. (See above for error)
```
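VK_ERROR_INCOMPATIBLE_DRIVER generally means the Vulkan loader found no usable ICD inside the container. A quick check (a sketch, assuming vulkan-tools is available; the paths are the loader's usual search locations):

```sh
# List Vulkan ICD manifests visible to the loader.
ls /usr/share/vulkan/icd.d/ /etc/vulkan/icd.d/ 2>/dev/null

# Ask the loader directly. Under WSL2 the GPU is passed through via
# /dev/dxg rather than a native NVIDIA Vulkan driver, so this can fail
# even when CUDA works fine.
vulkaninfo --summary
```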
Hey, I made a Docker image that may help benchmark MLC LLM performance: https://github.com/junrushao/llm-perf-bench

On the other hand, I don't really think Docker is a perfect abstraction for these use cases (conda may be), but this image should be a decent one.
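For anyone who prefers the conda route, a hypothetical environment file might look like this; the channel and package name follow MLC's install instructions at the time and may have changed since:

```yaml
# Hypothetical environment.yml for running mlc_chat_cli without Docker.
# Channel/package names are assumptions based on MLC's docs of the era.
name: mlc-chat
channels:
  - mlc-ai
  - conda-forge
dependencies:
  - git-lfs
  - mlc-chat-cli-nightly
```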