Any distributed capabilities planned?
Hello,
I recently came across WebLLM and think it holds great potential as a viable in-browser LLM approach.
In the past I have read about and tested Petals (https://github.com/bigscience-workshop/petals), which distributes large LLMs across many client systems, but I think that approach has some real limitations.
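For context, the kind of split Petals does is roughly pipeline parallelism: each peer hosts a contiguous block of transformer layers and passes the activations along. Here is a hypothetical sketch of that idea (the peer URLs, `/forward` endpoint, and payload shape are made up for illustration, not Petals' actual API):

```ts
// Hypothetical pipeline-parallel split across peers. Each peer hosts a
// contiguous block of transformer layers behind a /forward endpoint and
// returns the updated activations. URLs and payload shape are illustrative.
const peers = [
  "https://peer-a.example/forward", // layers 0-15
  "https://peer-b.example/forward", // layers 16-31
];

async function distributedForward(hidden: number[]): Promise<number[]> {
  // Chain the partial forward passes through each peer in order.
  for (const url of peers) {
    const res = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ hidden }),
    });
    ({ hidden } = await res.json()); // each hop applies its layer block
  }
  return hidden;
}
```

The limitations I alluded to are the obvious ones with this design: every generated token round-trips through every peer, so latency stacks up, and a single slow or offline peer stalls the whole pipeline.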
I was wondering if WebLLM is exploring any distribution ideas to scale to larger LLMs, which tend to be more accurate and more capable.
Thanks
I won't speak for the WebLLM project. I have an interest in adding a feature like this to my separate project (https://github.com/DecentAppsNet) which wraps web-llm. Feel free to contact me outside of this issue board if you want more info.
Also, woolball (https://github.com/woolball-xyz) might have what you need.
Hi @lonnietc, there's currently no plan to support distributed inference across multiple clients. WebLLM is intended to run in your browser and utilize the WebGPU abstraction provided by the browser.
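For reference, typical usage looks roughly like this, with everything running client-side on WebGPU (the model ID below is just one example from the prebuilt list):

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Downloads and compiles the model inside the browser tab via WebGPU.
// "Llama-3.1-8B-Instruct-q4f32_1-MLC" is one prebuilt model ID; any entry
// from the prebuilt model list works here.
const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC", {
  initProgressCallback: (report) => console.log(report.text), // load progress
});

// OpenAI-style chat completion, served entirely in-browser.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(reply.choices[0].message.content);
```

Once the weights are cached there is no network hop at all, which is why a cross-client distribution layer would have to live outside the engine itself.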