Results 171 comments of Charlie Ruan

@cometta Hmm, is there a specific reason for this? We do have APIs to delete the model weights from the cache

Hi, out of curiosity, which version of mlc-llm are you using, what is the length of the context, and which model is it? I remember an **older version** of mlc-llm...

Thanks for raising the issue. @harrywhoo is close to fixing this.

Hi folks, sorry for the delay; it is still in progress. In the meantime, to unblock immediately, it might be helpful to check out the commits listed in this PR and...

Hi, thanks for your interest! You can check out this example for how to use RAG w/ WebLLM: https://github.com/mlc-ai/web-llm/tree/main/examples/embeddings We currently support `snowflake-arctic-embed`.

Hi! Yes, the b4 and b32 wasms have different WebGPU kernels, but share the same weights (hence the same HF URLs). See https://github.com/mlc-ai/web-llm/pull/538 for details: > `b32` means the model...

> Do I need to manually truncate my inputs to be size context_length - max_tokens? If you want to make sure you can decode `max_tokens` number of tokens, then yes....
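The arithmetic above can be sketched as follows. This is a minimal illustration, not mlc-llm's actual truncation logic; `truncate_prompt` is a hypothetical helper and the token counts are made up:

```python
def truncate_prompt(prompt_tokens, context_length, max_tokens):
    """Trim the prompt so that max_tokens can still be decoded.

    context_length - max_tokens is the largest prompt we can afford
    while guaranteeing room for max_tokens generated tokens.
    """
    budget = context_length - max_tokens
    if budget <= 0:
        raise ValueError("max_tokens leaves no room for the prompt")
    # Keep the tail of the prompt, usually the most relevant part.
    return prompt_tokens[-budget:]

# Illustrative numbers: a 4096-token context window with 512 tokens
# reserved for generation leaves room for a 3584-token prompt.
tokens = list(range(5000))
truncated = truncate_prompt(tokens, context_length=4096, max_tokens=512)
print(len(truncated))  # 3584
```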

I was able to reproduce it, should be an issue on [tokenizers-cpp/web](https://github.com/mlc-ai/tokenizers-cpp/tree/main/web), where it does not work with certain `tokenizer.json`. A minimal example to reproduce is to run the following...

My initial guess is that the `padding` field in `tokenizer.json` triggers this issue. It is not present in your original weights: https://huggingface.co/julientfai/Qwen2-0.5B-Instruct-q4f16_1-Opilot/blob/main/tokenizer.json But it is present in both: - My `tokenizer.json`:...
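A quick way to check whether a given `tokenizer.json` carries the `padding` field is sketched below. The JSON fragments are made-up stand-ins for real files (which contain many more fields), and `has_padding` is a hypothetical helper, not part of tokenizers-cpp:

```python
import json

# Made-up fragment mimicking a tokenizer.json that sets padding.
tokenizer_with_padding = json.loads("""
{
  "version": "1.0",
  "padding": {"strategy": "BatchLongest", "pad_id": 0, "pad_token": "<pad>"},
  "truncation": null
}
""")

# Made-up fragment where padding is explicitly null (as in many HF exports).
tokenizer_without_padding = json.loads('{"version": "1.0", "padding": null}')

def has_padding(tok):
    # The field can be absent or explicitly null; both count as "no padding".
    return tok.get("padding") is not None

print(has_padding(tokenizer_with_padding))     # True
print(has_padding(tokenizer_without_padding))  # False
```

Diffing the two files this way makes it easy to confirm whether `padding` is the distinguishing field between a working and a failing `tokenizer.json`.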

Confirmed on my end that the issue is fixed with WebLLM npm 0.2.57. For more details, see https://github.com/mlc-ai/tokenizers-cpp/pull/42