
Universal LLM Deployment Engine with ML Compilation

Results: 578 mlc-llm issues

Why is the TestFlight MLC Chat iOS application listed in the "iOS only" section and unable to run on an M1 Mac?

feature request

https://huggingface.co/HuggingFaceH4/starchat-alpha fails as not supported, because it uses the GPT-BigCode architecture: https://huggingface.co/docs/transformers/model_doc/gpt_bigcode

new-models

Hello, in mlc-chat-cli, Vicuna is not able to answer simple questions like square roots. Any clue about this? Thank you, guys (Windows 10, vicuna-v1-7b-q3f16_0). Examples.... USER: What is the...

question

I ran into problems when building MOSS, since the config.json of the MOSS model (fnlp/moss-moon-003-sft) was changed by the developer. In the config.json file, several parameters are not included, such as hidden_size. While...

new-models

I successfully deployed an LLM on an iPhone 12 Pro without any errors, but the output is garbled. I think it might be an issue during model quantization, but I have...

## πŸ› Bug I try to run it on Android following this instruction: https://github.com/mlc-ai/mlc-llm/blob/main/android/README.md But I have an error at the step `make -j` ## To Reproduce Steps to reproduce...

bug

## ❓ General Questions cmake .. -- Forbidding undefined symbols in shared library, using -Wl,--no-undefined on platform Linux -- Didn't find the path to CCACHE, disabling ccache -- VTA build...

type: question

## πŸ› Bug ## To Reproduce python3 build.py --hf-path databricks/dolly-v2-3b --quantization q3f16_0 --max-seq-len 768 Weights exist at dist/models/dolly-v2-3b, skipping download. Using path "dist/models/dolly-v2-3b" for model "dolly-v2-3b" Database paths: ['log_db/redpajama-3b-q4f16', 'log_db/redpajama-3b-q4f32',...

bug

Hi all, I put a lot of effort into running this demo, but it crashes with this error. Could anyone give some support? ``` ./build/mlc_chat_cli --model dolly-v2-3b Use MLC...

troubleshooting

## ❓ General Questions After building dolly-v2-3b successfully, I ran chat.py with the model, but the inference latency is on the order of tens of minutes. Is that normal? Or...

question