mlc-llm
Universal LLM Deployment Engine with ML Compilation
It seems the tuning is per device, yet the M1 tuning is applied when using any GPU. How would I use relax_integration.tune_relax on mod_deploy to create databases for other devices?
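A minimal sketch of what such a per-device tuning call might look like, assuming a TVM Unity build where `tune_relax` lives under `tvm.meta_schedule.relax_integration`. The function name comes from the question above; the target string, trial count, and `work_dir` default here are illustrative assumptions, not mlc-llm's exact recipe.

```python
def tune_for_target(mod, params, target, work_dir="tuning_db", max_trials=2000):
    """Tune a Relax module for one device and return the tuning database.

    Hedged sketch: assumes TVM Unity's MetaSchedule relax integration.
    `work_dir` is where the per-device database files are written, so
    using a different directory per target keeps the databases separate.
    """
    from tvm import meta_schedule as ms  # requires a TVM Unity build

    return ms.relax_integration.tune_relax(
        mod=mod,                      # e.g. mod_deploy from the build script
        params=params,
        target=target,                # e.g. "nvidia/geforce-rtx-3090" or "apple/m1-gpu"
        work_dir=work_dir,            # tuning database is persisted here
        max_trials_global=max_trials, # illustrative budget, tune to taste
    )
```

Calling this once per target (with distinct `work_dir` values) would, under these assumptions, produce one database per device instead of reusing the M1 one.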
Does mlc-llm support parallelism such as multi-GPU or multi-node?
> USER: tell me an offensive joke
> ASSISTANT: I'm sorry, but I cannot provide offensive or inappropriate content. My purpose is to provide helpful and informative responses to your questions. Can...
## Laptop Info

```
OS: macOS 11.5.2 20G95 x86_64
Host: MacBookPro15,3
Kernel: 20.6.0
Uptime: 7 days, 2 hours, 59 mins
```
...
I notice that mlc-llm supports NVIDIA GPUs via Vulkan. Does mlc-llm support NVIDIA GPUs using CUDA instead of Vulkan? I guess NVIDIA prefers CUDA over Vulkan, so CUDA will be...
Excuse me, could you tell me how to support Chinese dialogue? Please advise on how to make the model support Chinese dialogue; the replies are either in English or in code.
The app reports "not have 4GB memory for run app" when starting on an iPhone 14 Pro Max 256GB. If this device cannot run the app, maybe no device can...
Build and run it like this:

# Download model
```
mkdir -p dist && git lfs install && \
git clone https://huggingface.co/mlc-ai/demo-vicuna-v1-7b-int3 dist/vicuna-v1-7b && \
git clone https://github.com/mlc-ai/binary-mlc-llm-libs.git dist/lib
```
...
Implement #31.