mlc-llm
Universal LLM Deployment Engine with ML Compilation
`python3 build.py --hf-path databricks/dolly-v2-3b --quantization q3f16_0 --max-seq-len 768`
```
Weights exist at dist/models/dolly-v2-3b, skipping download.
Using model path dist/models/dolly-v2-3b
Automatically configuring target: cuda -keys=cuda,gpu -arch=sm_80 -max_num_threads=1024 -thread_warp_size=32
Segmentation fault (core dumped)...
```
`mlc_chat_cli` is running on my WSL, but it seems that it didn't use the GPU and instead ran entirely on the CPU, which is very slow. How can I use the GPU...
I tested the `mlc_chat_cli` command in a Linux environment, and I want to deploy it as an API that services can call. Is there a convenient way to deploy it,...
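One generic way to answer this kind of question is to wrap the CLI in a small HTTP service. The sketch below uses only the Python standard library; the chat command is a placeholder (`echo`), not mlc-llm's actual invocation — you would swap in the real `mlc_chat_cli` call and any argument handling your setup needs:

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder command for illustration only; replace with the actual
# mlc_chat_cli invocation for your model and environment.
CHAT_COMMAND = ["echo"]

def run_chat(prompt: str) -> str:
    """Run the chat command once with the prompt and return its stdout."""
    result = subprocess.run(
        CHAT_COMMAND + [prompt],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the raw request body as the prompt text.
        length = int(self.headers.get("Content-Length", 0))
        prompt = self.rfile.read(length).decode("utf-8")
        reply = run_chat(prompt).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

def serve(port: int = 8000) -> None:
    """Block forever, serving POST /chat-style requests on localhost."""
    HTTPServer(("127.0.0.1", port), ChatHandler).serve_forever()
```

Note that spawning one subprocess per request reloads model weights every time; a production setup would keep a single long-lived chat process (or use a library binding) behind the server instead.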
build code: `python build.py --model vicuna-v1-7b --quantization q4f16_0 --target android --max-seq-len 768` error: ``` Using model path dist/models/vicuna-v1-7b Load cached module from dist/vicuna-v1-7b-q4f16_0/mod_cache_before_build_android.pkl and skip tracing. You can use --use-cache=0...
Cannot find vicuna-v1-7b lib in preferred path "dist/vicuna-v1-7b/float16/vicuna-v1-7b_metal_float16.dylib" or other candidate paths
When I run `python3 build.py --model vicuna-v1-7b`, I get the following error. Has anyone resolved it?
Hello. I followed your build instructions from README.md and they are not reproducible. It would be great if the team could fix the build instructions. One idea: create a new...
I'm using a conda environment with python=3.10, and I did: `pip install apache-tvm pytest`. Why did that happen? Did I install tvm the wrong way?
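For "did I install it the wrong way" questions, a quick first diagnostic is to check which file Python would actually import the package from — a pip-installed `apache-tvm` and a source-built TVM land in different places. This is a generic stdlib sketch, not mlc-llm-specific:

```python
import importlib.util

def module_location(name: str) -> str:
    """Return the path a module would be imported from, or a not-found message."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return f"{name} not found"
    return spec.origin

if __name__ == "__main__":
    # Prints e.g. ".../site-packages/tvm/__init__.py", or "tvm not found".
    print(module_location("tvm"))
```

If the printed path points at a site-packages copy when you expected a source build (or vice versa), the wrong TVM is shadowing the one you meant to use.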
Is there a hacky way to monitor current system metrics on iPadOS?
Below is the output of `vulkaninfo --summary`: