Tianqi Chen
The error log indeed indicates a possible OOM ("gpu get lost"), because the model is too VRAM-demanding.
We need to know the total amount of VRAM that can be allocated. In Vulkan, I think this may correspond to `PhysicalDeviceMemoryProperties` and the memory heap sizes. Note that there can...
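For the browser side (where WebLLM runs), there is no direct WebGPU equivalent of Vulkan's `PhysicalDeviceMemoryProperties`, so total VRAM is not directly queryable; the closest signal is the adapter limits, which only bound individual allocations. A minimal sketch, assuming `@webgpu/types` for the `navigator.gpu` typings:

```ts
// Rough browser-side sketch: WebGPU does not expose total VRAM the way
// Vulkan's memory heaps do. The adapter limits below only bound single
// allocations, but they are a useful early signal before loading a
// large model. Assumes @webgpu/types for the navigator.gpu typings.
async function logGpuLimits(): Promise<void> {
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) {
    throw new Error("WebGPU is not available in this browser");
  }
  console.log("maxBufferSize:", adapter.limits.maxBufferSize);
  console.log(
    "maxStorageBufferBindingSize:",
    adapter.limits.maxStorageBufferBindingSize
  );
}
```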
Got it, this is surprising; it would be useful to look into the difference. I wonder if it has to do with the way things are bundled.
I think in this case, reacting to the WebLLM error and then triggering a restart of the UI would likely work better (see the sketch below): https://github.com/mlc-ai/web-llm/blob/main/examples/simple-chat/src/simple_chat.ts#L239 cc @CharlieFRuan, do you mind taking a look?
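For illustration, a hedged sketch of that pattern, assuming the current `@mlc-ai/web-llm` engine API (`CreateMLCEngine` / `engine.unload()`); `generateWithRecovery` and `resetUI` are hypothetical names, not the actual simple-chat code:

```ts
// Sketch of "react to the WebLLM error, then restart the UI".
// Assumes the current @mlc-ai/web-llm API; resetUI is a hypothetical
// hook for whatever UI reset the embedding app needs.
import { MLCEngineInterface } from "@mlc-ai/web-llm";

async function generateWithRecovery(
  engine: MLCEngineInterface,
  prompt: string,
  resetUI: () => void
): Promise<string> {
  try {
    const reply = await engine.chat.completions.create({
      messages: [{ role: "user", content: prompt }],
    });
    return reply.choices[0].message.content ?? "";
  } catch (err) {
    // On a device-lost / OOM style failure, drop the wedged engine
    // state and let the app rebuild the chat UI from scratch.
    await engine.unload();
    resetUI();
    throw err;
  }
}
```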
Thanks @tlopex, do you mind sending a PR?
This is now supported:
https://github.com/mlc-ai/mlc-llm/issues/2218
LangChain.js now has a WebLLM integration.
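A hedged sketch of using it, assuming the `ChatWebLLM` class shipped in `@langchain/community`; the model id is just an example from the prebuilt WebLLM model list:

```ts
// Sketch of the LangChain.js + WebLLM integration, assuming the
// ChatWebLLM chat model in @langchain/community.
import { ChatWebLLM } from "@langchain/community/chat_models/webllm";
import { HumanMessage } from "@langchain/core/messages";

const model = new ChatWebLLM({
  model: "Llama-3-8B-Instruct-q4f16_1-MLC", // example prebuilt model id
  chatOptions: { temperature: 0.5 },
});

// Downloads and initializes the model in the browser; the callback
// reports loading progress.
await model.initialize((progress) => console.log(progress.text));

const response = await model.invoke([new HumanMessage("What is WebLLM?")]);
console.log(response.content);
```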
Please check out the latest instructions at https://mlc.ai/mlc-llm/docs/compilation/compile_models.html
Thanks @DustinBrett, do you mind sending a PR to fix this?