Charlie Ruan
Ahh, thanks for the catch; this is likely because we need `'` instead of `"`. Will send a patch now and fix it in 0.2.40.
Could you try 0.2.40, which includes https://github.com/mlc-ai/web-llm/pull/440?
Thanks for the issue, will take a look this weekend. A low-level forward/sample API may still be helpful; DebugChat may be a reasonable place to put it.
For the record, https://github.com/mlc-ai/web-llm/tree/main/examples/simple-chat-upload is an example of supporting a local model in the app; a minimal sketch of the configuration side follows.
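For readers who want the gist without opening the example: a rough sketch of registering a custom model via `AppConfig`, assuming the current `ModelRecord` field names; the URLs and model ID below are placeholders, and the linked example additionally handles uploading the files from disk.

```typescript
import {
  AppConfig,
  CreateMLCEngine,
  prebuiltAppConfig,
} from "@mlc-ai/web-llm";

// Placeholder URLs: in the upload example these point at user-provided files.
const appConfig: AppConfig = {
  model_list: [
    ...prebuiltAppConfig.model_list,
    {
      model: "https://example.com/my-model", // directory with MLC-format weights
      model_id: "MyModel-q4f32_1-MLC", // hypothetical ID used below
      model_lib: "https://example.com/my-model-webgpu.wasm", // compiled model library
    },
  ],
};

async function main() {
  const engine = await CreateMLCEngine("MyModel-q4f32_1-MLC", { appConfig });
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Hello!" }],
  });
  console.log(reply.choices[0].message.content);
}

main();
```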
Thanks for your interest! You can use `engine.interruptGenerate();` in WebLLM. See the streaming example: https://github.com/mlc-ai/web-llm/blob/632d34725629b480b5b2772379ef5c150b1286f0/examples/streaming/src/streaming.ts#L48
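For reference, a minimal sketch of interrupting a streamed completion; the model ID and the chunk-count stop condition here are illustrative (in a real app the call would be wired to a "Stop" button), not taken from the linked example.

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Any model ID from the prebuilt list works; this one is illustrative.
  const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC");

  const chunks = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Write a very long story." }],
    stream: true,
  });

  let received = 0;
  for await (const chunk of chunks) {
    console.log(chunk.choices[0]?.delta?.content ?? "");
    // Illustrative trigger; normally a user action calls this instead.
    if (++received >= 5) {
      engine.interruptGenerate(); // the stream ends shortly after
    }
  }
}

main();
```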
Thanks for the question! The `wasm` is composed of various parts, including the kernels of the model (in WGSL) and runtime support (C++ code compiled into WASM).
- The kernel...
Thanks for the question! Embedding models have been supported in MLC-LLM since a month ago (https://github.com/mlc-ai/mlc-llm/pull/2249), so there is no technical blocker to supporting them in WebLLM with WebGPU acceleration, but only...
Hi @ggaabe @Bert0324, npm 0.2.60 has initial support for embeddings and RAG; check out the usage here: https://github.com/mlc-ai/web-llm/tree/main/examples/embeddings Currently only [snowflake-arctic-embed](https://huggingface.co/Snowflake/snowflake-arctic-embed-m) is supported. Closing for now, feel free to open a new...
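As a quick sketch of the OpenAI-style embeddings usage: the model ID string below is an assumption (the prebuilt list has a few snowflake-arctic-embed variants), and the linked examples/embeddings directory has the canonical code.

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Model ID assumed; see examples/embeddings for the exact string.
  const engine = await CreateMLCEngine("snowflake-arctic-embed-m-q0f32-MLC-b4");

  // OpenAI-style embeddings endpoint.
  const reply = await engine.embeddings.create({
    input: ["What is the capital of Canada?", "Ottawa is the capital of Canada."],
  });
  console.log(reply.data.length); // one embedding per input
  console.log(reply.data[0].embedding.length); // embedding dimension
}

main();
```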
This should be fixed by https://github.com/mlc-ai/web-llm/pull/571 and will be available in the next npm release!
@talperetz @time2bot @nicmeriano Should be fixed with npm version `0.2.66`. Closing this for now; feel free to open another one if the issue persists!