Charlie Ruan
I don't think the recent changes require redownloading. Besides, even if we were to redownload a model, only the model library (tens of MB) would need to be re-fetched, not the weights.
@UXDart This shouldn't be related to the model library versioning issue described by the OP. Seems to be more related to https://github.com/mlc-ai/web-llm/issues/322 and https://github.com/mlc-ai/web-llm/issues/313. Will look into this issue today;...
@UXDart @kudaibergenu Just updated the npm package to 0.2.24: https://github.com/mlc-ai/web-llm/pull/323 This should fix it. Let us know if issues persist. Thank you!
Update: we now support wasm versioning such that whenever a new version of wasm is required, we will download the new wasm (since the URL is different). The idea is...
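To illustrate the idea, here is a rough sketch of URL-keyed caching (not the actual web-llm implementation; the cache name is hypothetical). Because each wasm version lives at a different URL, the lookup misses when the version changes and the new wasm is fetched automatically:

```typescript
// Sketch: cache the wasm artifact keyed by its full URL. A new library
// version has a new URL, so the cache lookup misses and we fetch fresh.
async function fetchWasmWithCache(wasmUrl: string): Promise<ArrayBuffer> {
  const cache = await caches.open("webllm-wasm-cache"); // hypothetical name
  let response = await cache.match(wasmUrl);
  if (response === undefined) {
    // Cache miss: first load, or the versioned URL changed.
    response = await fetch(wasmUrl);
    await cache.put(wasmUrl, response.clone());
  }
  return response.arrayBuffer();
}
```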
Closing this issue as completed; feel free to open new ones!
Hi @devashish234073, if you look at https://github.com/mlc-ai/web-llm/blob/main/examples/simple-chat/src/gh-config.js, there is a field called `vram_required_MB` for each model. I would say it is an optimistic estimation and the actual usage should be...
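For reference, one of those records looks roughly like this (only `vram_required_MB` is the field discussed above; the other field names and values are illustrative and may not match gh-config.js exactly):

```js
// Illustrative model record; values are examples, not exact.
const modelRecord = {
  local_id: "Llama-2-7b-chat-hf-q4f32_1",
  model_url: "https://huggingface.co/mlc-ai/...", // where the weights live
  vram_required_MB: 9109.03, // optimistic estimate of GPU memory usage
  low_resource_required: false,
};
```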
While most of the models we currently support are **simply not trained** on >4K context lengths, the Mistral model uses sliding window attention, meaning it **technically can** deal with...
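To make the sliding-window point concrete: with window size W, a token only attends to the previous W tokens at each layer, but stacking L layers lets information propagate roughly L × W tokens back. A minimal sketch of the mask (illustrative, not web-llm code; Mistral's window is 4096):

```typescript
// Build a causal attention mask with a sliding window: token i may
// attend to token j only when j <= i and i - j < windowSize.
function slidingWindowMask(seqLen: number, windowSize: number): boolean[][] {
  const mask: boolean[][] = [];
  for (let i = 0; i < seqLen; i++) {
    const row: boolean[] = [];
    for (let j = 0; j < seqLen; j++) {
      row.push(j <= i && i - j < windowSize);
    }
    mask.push(row);
  }
  return mask;
}
```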
Good point; we currently have relatively trivial unit tests, but ones that involve the actual usage of WebGPU are not included yet. I have not looked into how testing with...
That'd be great, thank you so much! I also found this blog post that might be related: https://developer.chrome.com/blog/supercharge-web-ai-testing
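If it helps, a hedged sketch of what that post describes: launching headless Chrome with WebGPU enabled (e.g. via Puppeteer) so GPU-path tests can run in CI. The flags below follow the post's Linux/Vulkan setup and may need adjusting per platform and Chrome version:

```typescript
import puppeteer from "puppeteer";

// Launch headless Chrome with WebGPU enabled and sanity-check that the
// page actually exposes navigator.gpu before running real tests.
async function main() {
  const browser = await puppeteer.launch({
    headless: "new",
    args: [
      "--enable-unsafe-webgpu",
      "--enable-features=Vulkan",
      "--use-angle=vulkan",
    ],
  });
  const page = await browser.newPage();
  const hasWebGPU = await page.evaluate(() => "gpu" in navigator);
  console.log("WebGPU available:", hasWebGPU);
  await browser.close();
}

main();
```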
Hi, I think Chrome v120 should support f16 by default. Could you follow this issue https://github.com/mlc-ai/web-llm/issues/241 and see if https://webgpureport.org/ gives you the same result as the one in that...
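A quick way to check the same thing programmatically from the browser console (this uses the standard WebGPU API; `checkF16` is just a name for the sketch):

```typescript
// Request a WebGPU adapter and report whether it supports the
// "shader-f16" feature, the same entry webgpureport.org lists.
async function checkF16(): Promise<void> {
  const adapter = await navigator.gpu?.requestAdapter();
  if (!adapter) {
    console.log("WebGPU is not available in this browser.");
    return;
  }
  console.log("shader-f16 supported:", adapter.features.has("shader-f16"));
}

checkF16();
```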