Results: 167 comments of Charlie Ruan

Hi @louis030195, Safari should be supported. I tried WebLLM Chat on:

- MacBook: macOS Sonoma 14.5 with Safari Technology Preview
- iPhone: iOS 18.0 Developer Beta with Safari (need WebLLM...
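
In case it helps with debugging, here is a minimal, framework-free probe for WebGPU availability; this is plain browser TypeScript (assuming `@webgpu/types` for the `navigator.gpu` typings), not a WebLLM API:

```typescript
// Minimal WebGPU availability probe; run in a page script or the console.
async function checkWebGPU(): Promise<void> {
  if (!("gpu" in navigator)) {
    console.log("WebGPU not exposed; on Safari, enable the WebGPU feature flag.");
    return;
  }
  const adapter = await navigator.gpu.requestAdapter();
  if (adapter === null) {
    console.log("navigator.gpu exists, but no adapter was returned.");
    return;
  }
  console.log("WebGPU adapter acquired; WebLLM should be able to initialize.");
}

checkWebGPU();
```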

They are due to the system prompt shown in the `mlc-chat-config.json`: https://huggingface.co/mlc-ai/Llama-2-7b-chat-hf-q4f16_1-MLC/blob/main/mlc-chat-config.json#L33-L34, which follows the specification of the official model release. If you'd like not to use a system...
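
For reference, one way to avoid the packaged default is to pass your own `system` message through web-llm's OpenAI-style API. A minimal sketch, assuming a request-level system message takes precedence over the config default (model id is illustrative):

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Illustrative model id; any prebuilt model works the same way.
  const engine = await CreateMLCEngine("Llama-2-7b-chat-hf-q4f16_1-MLC");

  const reply = await engine.chat.completions.create({
    messages: [
      // Request-level system message, rather than the default from
      // mlc-chat-config.json (assumption: this takes precedence).
      { role: "system", content: "You are a concise assistant." },
      { role: "user", content: "What is WebGPU?" },
    ],
  });
  console.log(reply.choices[0].message.content);
}

main();
```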

Hi @tlopex, thanks for the questions!

> 2. Does this mean that as long as the results are consistent, the implementation is acceptable?

That is largely correct, as long as...

@tlopex Apologies for the late reply. Please keep the questions coming; it'd also be helpful for other people trying to learn the workflow.

> 1. I found that the code...

You can reuse `train.py`; just replace, e.g., `MoELLaVAStablelmForCausalLM` with `EvalMoELLaVAStablelmForCausalLM`, after which `initialize_moe_modules()` is no longer needed. Then call `requires_grad_()` as needed.

Example of trying to allocate a KV cache with a 900k context length (should be similar to trying to load a model that is too large):
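
For context, a sketch of how such an allocation can be requested, assuming the `context_window_size` chat-option override (field name as in `mlc-chat-config.json`; model id illustrative):

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Deliberately oversized context window: the KV cache allocation should
  // fail at load time with a GPU out-of-memory style error, much like
  // loading a model that exceeds device memory.
  await CreateMLCEngine(
    "Llama-3-8B-Instruct-q4f16_1-MLC", // illustrative model id
    {},                                 // engine config: defaults
    { context_window_size: 900_000 }    // chat-option override (assumption)
  );
}

main().catch((err) => console.error("Engine load failed:", err));
```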

Marked as a draft for now, as it depends on https://github.com/apache/tvm/pull/17005

Thanks for reporting the error, will send a fix soon

Could you try 0.2.38? It should be fixed via https://github.com/mlc-ai/web-llm/pull/415. Apologies for the inconvenience.