mlc-llm
Universal LLM Deployment Engine with ML Compilation
This is data from a Snapdragon 8 Gen 2 processor. The GPU is fully occupied at runtime; by the way, is it using int8 or some other precision? Can it run on the NPU? If it can run on the NPU, int4 quantization might give even better speed (this is the first processor whose NPU supports int4).
When I run mlc_chat_cli on Mac M1, the following error occurs: /Users/jshao/Projects/mlc-ai-utils/tvm/src/runtime/metal/metal_device_api.mm:165: Intializing Metal device 0, name=Apple M1 Initializing the chat module... [15:54:34] /Users/jshao/Projects/mlc-ai-utils/tvm/src/runtime/relax_vm/vm.cc:768: --------------------------------------------------------------- An error occurred during the execution...
When I try to build on my computer, I run ```python3 build.py --model vicuna-v1-7b --dtype float16 --target iphone --quantization-mode int3 --quantization-sym --quantization-storage-nbit 16 --max-seq-len 768```. Environment: CentOS Linux release 8.2.2004, NVIDIA-SMI...
My process: 1. Install TVM Unity and compile it successfully. 2. Get the model weights. 3. Build the model into the library with python3 build.py --model vicuna-v1-7b --dtype float16 --target iphone --quantization-mode int3 --quantization-sym...
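For reference, here is a minimal sketch of step 3 driven from Python, assuming the early mlc-llm layout where `build.py` sits at the repository root and the weights from step 2 are already downloaded; the flag set simply mirrors the command quoted above and is not an authoritative invocation.

```python
# Minimal sketch: invoke the build.py compilation step (step 3) from Python.
# Assumes build.py is in the current directory and the vicuna-v1-7b weights
# (step 2) are already in place; steps 1-2 are not automated here.
import subprocess

build_cmd = [
    "python3", "build.py",
    "--model", "vicuna-v1-7b",
    "--dtype", "float16",
    "--target", "iphone",
    "--quantization-mode", "int3",
    "--quantization-sym",
    "--quantization-storage-nbit", "16",
    "--max-seq-len", "768",
]

# check=True makes a non-zero exit from build.py raise CalledProcessError,
# so a failed compilation is not silently ignored.
subprocess.run(build_cmd, check=True)
```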
I've run Vicuna-7B successfully on an Android device. I'm trying to run the https://huggingface.co/bigscience/bloomz model on my device. Could you provide some tips on adding support for BloomZ? Are there any videos about...
Recently we refactored [`conversation.py`](https://github.com/lm-sys/FastChat/blob/main/fastchat/conversation.py) in FastChat to make it easier to register new conversation templates without using too many if statements. Could you also consider following our updates? Then we can...
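To illustrate what registering templates without long if chains looks like, here is a self-contained sketch of the registry pattern; the names `Conversation`, `register_conv_template`, and `get_conv_template` echo fastchat/conversation.py, but the fields shown are simplified assumptions rather than the exact upstream API.

```python
# Sketch of a conversation-template registry: templates are registered once
# in a dict and looked up by name, instead of being chosen via if/elif chains.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Conversation:
    name: str
    system: str
    roles: Tuple[str, str]
    sep: str
    messages: List[Tuple[str, str]] = field(default_factory=list)

conv_templates: Dict[str, Conversation] = {}

def register_conv_template(template: Conversation) -> None:
    """Store the template under its name for later lookup."""
    conv_templates[template.name] = template

def get_conv_template(name: str) -> Conversation:
    """Fetch a registered template by name."""
    return conv_templates[name]

# Adding a new model's prompt format is a single registration call,
# not another branch in a growing conditional.
register_conv_template(
    Conversation(
        name="vicuna_v1.1",
        system="A chat between a curious user and an AI assistant.",
        roles=("USER", "ASSISTANT"),
        sep=" ",
    )
)
```

With this pattern, supporting a new model's conversation format is one `register_conv_template(...)` call rather than another edit to a shared if/elif block.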
Hello, I am trying to reproduce your Vulkan compilation process for the `build.py` script on Linux. The steps I've taken: - Install all the base requirements (MKL, Vulkan headers, SPIRV,...
I am trying to build the model for Android according to the guide. When I run `python3 build.py --model vicuna-v1-7b --quantization q4f16_0 --target android --max-seq-len 768`, I get the error below: ```...
Which models are currently supported for deployment? Or how can other models be deployed through this project?