mlc-llm
Universal LLM Deployment Engine with ML Compilation
Based on experimenting with GPTQ-for-LLaMa, int4 quantization seems to introduce a 3-5% degradation in perplexity, while int8 is almost identical to fp16. Would it be possible to use int8 quantization with...
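The quality gap the issue describes comes down to how few integer levels int4 leaves for the weights. A minimal sketch of symmetric per-tensor quantization (a simplified stand-in for GPTQ's per-group scheme, not MLC's actual code) makes the int4-vs-int8 reconstruction error concrete:

```python
import random

def quantize(weights, bits):
    # Symmetric per-tensor quantization: one scale maps the largest |w|
    # to the largest representable positive integer; each weight is then
    # rounded to the nearest level and clamped to the signed range.
    qmax = 2 ** (bits - 1) - 1          # 7 for int4, 127 for int8
    scale = max(abs(w) for w in weights) / qmax
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

random.seed(0)
# Synthetic "weights" roughly shaped like a trained layer's distribution.
weights = [random.gauss(0.0, 0.02) for _ in range(1024)]

mses = {}
for bits in (4, 8):
    q, scale = quantize(weights, bits)
    recon = dequantize(q, scale)
    mses[bits] = sum((w - r) ** 2 for w, r in zip(weights, recon)) / len(weights)
    print(f"int{bits}: mean squared reconstruction error = {mses[bits]:.3e}")
```

With 16x fewer levels, int4's rounding error is much larger than int8's, which is consistent with int8 tracking fp16 closely while int4 measurably hurts perplexity.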
Will there ever be an option to just run a normal .exe file, without having to run all these commands? If there is one, could it...
Hello there, I ran into this problem while executing the sample code given for installation on a MacBook M1. How should I resolve it? ``` An error occurred during the execution of...
Incredible project: I managed to run the model at good speed on my AMD hardware, thanks. I have a question: do you have any plans to offload the weights and...
Hey there, congratulations on a great release! The app works great on a Mac, and the installation was very straightforward. Do you have plans for growing `mlc_chat_cli` into a...
The 4090 can't run the 65B model. Can I run it on a MacBook with this?
I can't understand why this follows ChatGPT's path of restricting what the model can or cannot say. How can I disable the usual "As an AI language model, I...
This PR adds support for compiling and deploying the MOSS model, in particular moss-moon-003-sft.
I get this error when I run `git clone https://huggingface.co/mlc-ai/demo-vicuna-v1-7b-int3 dist/vicuna-v1-7b`: > Error downloading object: float16/params_shard_1.bin (0fb70c2): Smudge error: Error downloading float16/params_shard_1.bin (0fb70c297b47ce4ecade5f7875c4c90f518069bab49f359a1644766b2279e8e2): batch response: Post "https://huggingface.co/mlc-ai/demo-vicuna-v1-7b-int3.git/info/lfs/objects/batch": dial tcp: lookup...
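The "dial tcp: lookup" in that smudge error suggests a DNS/network failure reaching the Git LFS batch endpoint rather than a problem with the repository itself. Assuming that is the cause, one common workaround is to clone without materializing LFS objects and then fetch them in a separate, retryable step (the `GIT_LFS_SKIP_SMUDGE` variable and `git lfs pull` are standard git-lfs features; the paths match the issue's command):

```shell
# Clone only the repo metadata and LFS pointer files; no large downloads
# happen during the clone, so a flaky connection cannot abort it mid-way.
GIT_LFS_SKIP_SMUDGE=1 git clone \
  https://huggingface.co/mlc-ai/demo-vicuna-v1-7b-int3 dist/vicuna-v1-7b

# Fetch the actual weight shards; if this fails, re-running it resumes
# downloading only the objects that are still missing.
cd dist/vicuna-v1-7b
git lfs pull
```

If the DNS lookup keeps failing, checking the network proxy settings (or the `https_proxy` environment variable) for access to huggingface.co would be the next step.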