mlc-llm
Universal LLM Deployment Engine with ML Compilation
raise ValueError(f"Unknown model type: {model_type}. Available ones: {list(MODELS.keys())}") ValueError: Unknown model type: nemotron-nas. Available ones: ['llama', 'mistral', 'gemma', 'gemma2', 'gemma3', 'gemma3_text', 'gpt2', 'mixtral', 'gpt_neox', 'gpt_bigcode', 'phi-msft', 'phi', 'phi3', 'phi3_v', 'qwen',...
## 🐛 Bug I am having trouble running mlc-llm with Gemma 3 models on an M3 Pro MacBook (details are below). The error is as follows. ``` libc++abi:...
## 🐛 Bug I followed the guidelines at https://llm.mlc.ai/docs/deploy/ios.html and get an error when running ``` cd mlc_llm/ios/MLCChat mlc_llm package ``` I verified that all prerequisites are correctly installed: 1. Installed CMake (cmake 4.0.1)...
## ❓ General Questions While inference is 2-3 times faster than with llama.cpp, I observe some degradation in metrics. For example, I have a simple test to do some punctuation/capitalization/correction...
## ❓ General Questions How do I cross-compile MLC LLM for Android and run it on my Android phone via adb shell? I have already finished the mlc-llm package step. E:\mlc_llm\mlc-llm\3rdparty\xgrammar\cpp\tokenizer.cc(217,33):...
This PR adds support for LLaVA and Phi-V on Android devices. It is not thoroughly tested, but it works on my device (Android 14.0). Checkpoints: - https://huggingface.co/davidlightmysterion/llava-1.5-7b-hf-q4f16_1-MLC - https://huggingface.co/mlc-ai/Phi-3.5-vision-instruct-q4f16_1-MLC
## 🐛 Bug After following the installation steps, running MLCChat, and clicking on a model (in this case, I chose the bundled model "Llama-3.2-3B-Instruct-q4f16_1-MLC"), I get this...
## ⚙️ Request New Models - Link to an existing implementation (e.g. Hugging Face/GitHub): https://huggingface.co/microsoft/bitnet-b1.58-2B-4T - Is this model architecture supported by MLC-LLM? No. It has just come out. May...
## 🐛 Bug I found that the CLIP model is missing post_layernorm in forward. I tried to add post_layernorm, but ran into some problems. ## To Reproduce Steps to reproduce...
## 🐛 Bug I've seen that support for gemma3 was added in the last few days, which is great! However, it does not seem to work with the webgpu target. I...