mlc-llm
                        Universal LLM Deployment Engine with ML Compilation
What is the reason for the TestFlight MLC Chat iOS application being listed in the "iOS only" section and not being able to run on an M1 Mac?
https://huggingface.co/HuggingFaceH4/starchat-alpha fails with a "not supported" error, because it is a GPT-BigCode model: https://huggingface.co/docs/transformers/model_doc/gpt_bigcode
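A quick way to see why a checkpoint is rejected is to inspect its config.json before converting any weights. A minimal sketch, assuming the huggingface_hub package is installed:

```python
import json
from huggingface_hub import hf_hub_download

# Fetch only the config, not the multi-GB weights, to check the architecture.
config_path = hf_hub_download(
    repo_id="HuggingFaceH4/starchat-alpha", filename="config.json"
)
with open(config_path) as f:
    config = json.load(f)

# starchat-alpha reports "gpt_bigcode", the architecture this report says
# is not supported.
print(config.get("model_type"), config.get("architectures"))
```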
Hello, in mlc-chat-cli, Vicuna is not able to answer simple questions like square roots. Any clue about this? Thank you guys. (Windows 10, vicuna-v1-7b-q3f16_0) Examples.... USER: What is the...
I ran into problems when building MOSS, since the config.json of the MOSS model (fnlp/moss-moon-003-sft) was changed by the developer. Several parameters, like hidden_size, are no longer included in the config.json file, while...
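One possible workaround is to restore the missing keys in config.json before running build.py. A minimal sketch; the alias table below is an assumption and should be checked against the actual MOSS config and model card:

```python
import json

# Hypothetical mapping from the names the build expects to GPT-style aliases
# that some configs use instead; verify these against the real MOSS config.
ALIASES = {
    "hidden_size": ["n_embd", "d_model"],
    "num_attention_heads": ["n_head"],
    "num_hidden_layers": ["n_layer"],
}

path = "dist/models/moss-moon-003-sft/config.json"
with open(path) as f:
    config = json.load(f)

# Copy a value from the first alias found, only where the expected key is absent.
for canonical, alternates in ALIASES.items():
    if canonical not in config:
        for alt in alternates:
            if alt in config:
                config[canonical] = config[alt]
                break

with open(path, "w") as f:
    json.dump(config, f, indent=2)
```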
I successfully deployed an LLM on an iPhone 12 Pro without any errors, but the output is garbled. I think it might be an issue during model quantization, but I have...
## 🐛 Bug I try to run it on Android following these instructions: https://github.com/mlc-ai/mlc-llm/blob/main/android/README.md but I get an error at the step `make -j`. ## To Reproduce Steps to reproduce...
## ❓ General Questions cmake .. -- Forbidding undefined symbols in shared library, using -Wl,--no-undefined on platform Linux -- Didn't find the path to CCACHE, disabling ccache -- VTA build...
## 🐛 Bug ## To Reproduce python3 build.py --hf-path databricks/dolly-v2-3b --quantization q3f16_0 --max-seq-len 768 Weights exist at dist/models/dolly-v2-3b, skipping download. Using path "dist/models/dolly-v2-3b" for model "dolly-v2-3b" Database paths: ['log_db/redpajama-3b-q4f16', 'log_db/redpajama-3b-q4f32',...
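The database paths printed above are redpajama tuning logs. A small sketch to list which tuning databases a checkout actually ships under log_db/, assuming the default repository layout; whether a missing dolly-v2-3b entry is related to this bug is an open question:

```python
from pathlib import Path

# List the tuning databases build.py can pick up. If none matches the model
# being built, compilation presumably proceeds without tuned schedules.
for db in sorted(Path("log_db").iterdir()):
    print(db.name)
```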
Hi all, I put a lot of effort into running this demo, but it crashes with this error. Could anyone give some support?
```
./build/mlc_chat_cli --model dolly-v2-3b
Use MLC...
```
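Startup crashes like this are often caused by a missing build artifact. A minimal sanity-check sketch; the dist/ layout and file names below are assumptions based on a default q3f16_0 build, not confirmed from the report:

```python
from pathlib import Path

# Hypothetical layout: build.py writes quantized params and a compiled model
# library under dist/<model>-<quantization>/. Adjust the name to your build.
model_dir = Path("dist/dolly-v2-3b-q3f16_0")

print("params dir present:", (model_dir / "params").is_dir())
for lib in model_dir.glob("*.so"):
    print("found model library:", lib.name)
```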
## ❓ General Questions After building dolly-v2-3b successfully, I run chat.py with the model, but the inference latency is on the order of tens of minutes. Is that normal? Or...
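Latency of tens of minutes for a 3B model usually suggests the model is not running on a GPU at all. A minimal sketch to check which TVM devices exist, assuming the tvm runtime used by mlc-llm is importable:

```python
import tvm

# If none of these report True, execution is falling back to something far
# slower than the intended GPU target, which would explain the latency.
for name, dev in (("cuda", tvm.cuda()), ("vulkan", tvm.vulkan()), ("metal", tvm.metal())):
    print(f"{name} available: {dev.exist}")
```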