web-llm

High-performance In-browser LLM Inference Engine

Results 295 web-llm issues

Running the `Llama-2-7b-chat-hf-q4f32_1` model on Android throws `Cannot find adapter that matches the request`. See the demo page at https://webllm.mlc.ai/ on Android.

Hi there, it would be great if there was a way to edit and resend a message, similar to how you can in ChatGPT and other LLM providers. Thank you!

When downloading a model from Hugging Face I will very often get the following error: ``` index.js:2092 GET https://cdn-lfs-us-1.huggingface.co/repos/b6/29/b629b4991f1fc44df93315098c16e20f67d32d0396e59f57e5751d9182041f7e/ad401553effd67928f91e9bb36995317734a3f96e424ab44824927249eeac200?response-content-disposition=attachment%3B+filename*%3DUTF-8%27%27params_shard_110.bin%3B+filename%3D%22params_shard_110.bin%22%3B&response-content-type=application%2Foctet-stream&Expires=1706104533&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcwNjEwNDUzM319LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmh1Z2dpbmdmYWNlLmNvL3JlcG9zL2I2LzI5L2I2MjliNDk5MWYxZmM0NGRmOTMzMTUwOThjMTZlMjBmNjdkMzJkMDM5NmU1OWY1N2U1NzUxZDkxODIwNDFmN2UvYWQ0MDE1NTNlZmZkNjc5MjhmOTFlOWJiMzY5OTUzMTc3MzRhM2Y5NmU0MjRhYjQ0ODI0OTI3MjQ5ZWVhYzIwMD9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=Sfpo7G1QzfXixJpS-%7Esx%7Eck7AW09mX1xo-la7NL6VG67DgN%7E%7Ez7WdlsT5VzGq8PctsZM1Zwzl3Trz231lnqZQWJX7i5wBiWXUAiacHaLpace8AwjXpgEmY3M8p-kRXO2c9fKhINGXXWKn4wZLG0HIwrJz-%7EG-LzbaB4j7rq5f7APM-FVUE5O6znZrzHUvcAxTIOo25OY4g3AC18NMXEpdpEFapIHDRhHa6Y0EXkHL7VvkxVt%7EQeYFsmhc7C4-0BtTLat8LxaRvVaDA3RamwysZxEUnjrGS3Cw5gP7rCRNuF0H3VNCG8NvgD1GSfMZ3S4Wm98zzR2B3z-oBcMk3Hl1g__&Key-Pair-Id=KCD77M1F0VK2B net::ERR_NETWORK_CHANGED 200 (OK) Error: Cannot fetch https://huggingface.co/mlc-ai/Llama-2-7b-hf-q4f32_1-MLC/resolve/main/params_shard_110.bin err= NetworkError: Cache.add() encountered...
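Transient network failures like the one reported above are often handled by retrying with a growing delay. The following is a minimal sketch of that general pattern, not WebLLM's own download code: `withRetry`, its parameters, and the default values are hypothetical names chosen for illustration, and the source does not say whether a retried load resumes from shards that are already cached.

```typescript
// Retries an async operation with exponential backoff.
// Hypothetical helper; not part of the web-llm API.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt: baseDelayMs, 2x, 4x, ...
      if (attempt < maxAttempts - 1) {
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed; surface the most recent error.
  throw lastError;
}
```

An application could pass the call that triggers the download as `operation`, for example a function that fetches one shard and throws when the response is not OK.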

WebGPU is nowhere near as widely adopted as WebGL 1 and 2. This means that maybe 15% of people out there with an internet connection can run the project. I myself...

On my Pixel 7 Android device, the [maxStorageBufferBindingSize](https://gpuweb.github.io/gpuweb/#dom-supported-limits-maxstoragebufferbindingsize) limit is only 128. Sadly https://webllm.mlc.ai/#chat-demo requires 1024 for the `Llama-2-7b-chat-hf-q4f16_1` model as seen below when run on Chrome for Android. As...
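A limit like this can be compared against a model's needs before any weights are downloaded. The following is a minimal sketch, not WebLLM's own code: `AdapterLike` is a hypothetical structural stand-in for the `GPUAdapter` returned by `navigator.gpu.requestAdapter()`, and `findLimitShortfalls` and the numbers in the example are assumptions for illustration. WebGPU reports `maxStorageBufferBindingSize` in bytes.

```typescript
// Hypothetical stand-in for the part of GPUAdapter this sketch reads,
// so it compiles without the @webgpu/types package.
interface AdapterLike {
  limits: Record<string, number>;
}

interface LimitShortfall {
  name: string;
  required: number;
  available: number;
}

// Returns every required limit the adapter cannot satisfy.
// Assumes "maximum"-class limits, where a larger value is better.
function findLimitShortfalls(
  adapter: AdapterLike,
  required: Record<string, number>,
): LimitShortfall[] {
  const shortfalls: LimitShortfall[] = [];
  for (const [name, needed] of Object.entries(required)) {
    const available = adapter.limits[name] ?? 0;
    if (available < needed) {
      shortfalls.push({ name, required: needed, available });
    }
  }
  return shortfalls;
}

// Example with made-up values in bytes: 128 MiB offered, 1 GiB needed.
const MiB = 1024 * 1024;
const phone: AdapterLike = {
  limits: { maxStorageBufferBindingSize: 128 * MiB },
};
const shortfalls = findLimitShortfalls(phone, {
  maxStorageBufferBindingSize: 1024 * MiB,
});
console.log(shortfalls);
```

In a browser, the real `adapter.limits` object is a `GPUSupportedLimits` rather than a plain record, so it would need a cast (or a small copy step) before being passed in.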

When I use shader-f16, I find that my computer does support shader-f16. ![image](https://github.com/mlc-ai/web-llm/assets/103033429/ba6d7d46-0fe5-4feb-b2c1-b3e39fa2dd8f) So I commented out the code as follows: ![image](https://github.com/mlc-ai/web-llm/assets/103033429/1963b543-d341-48d6-9fdd-ab28cc78a8a3) But then another problem occurs: ![image](https://github.com/mlc-ai/web-llm/assets/103033429/ada571eb-4690-4900-9d21-b5d80ef728e2)

When I run web-llm, I get this error: Init error, Error: This model requires WebGPU extension shader-f16, which is not enabled in this browser. You can try to launch Chrome Canary...
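Whether an adapter advertises `shader-f16` can be checked directly through its `features` set. The following is a minimal sketch, not WebLLM's own check: `FeatureSource` is a hypothetical stand-in for the `GPUAdapter` returned by `navigator.gpu.requestAdapter()`, and `missingFeatures` and the mocked adapter are assumptions for illustration.

```typescript
// Hypothetical stand-in for the part of GPUAdapter this sketch reads,
// so it compiles without the @webgpu/types package.
interface FeatureSource {
  features: { has(name: string): boolean };
}

// Returns the required feature names the adapter does not advertise.
function missingFeatures(adapter: FeatureSource, required: string[]): string[] {
  return required.filter((name) => !adapter.features.has(name));
}

// Example with a mocked adapter that lacks shader-f16.
const mockAdapter: FeatureSource = {
  features: new Set<string>(["timestamp-query"]),
};
const missing = missingFeatures(mockAdapter, ["shader-f16"]);
console.log(
  missing.length === 0 ? "all features present" : `missing: ${missing.join(", ")}`,
);
```

Note that an advertised feature is not active by default: it also has to be listed in `requiredFeatures` when calling `adapter.requestDevice()`.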

### Win10 Path: examples/simple-chat can not follow Readme.md instruction ``` \webllm-npm\web-llm\examples\simple-chat>npm install npm WARN deprecated [email protected]: Modern JS already guarantees Array#sort() is a stable sort, so this library is deprecated....

The next example only seems to work when the source is built locally, and not when it's installed from npm. Weirdly enough, if I install the source locally and then...