Any distributed capabilities planned?
Hello,
I recently came across WebLLM and think it holds great potential as a viable in-browser LLM approach.
In the past I have read about and tested Petals (https://github.com/bigscience-workshop/petals), which distributes large LLMs across many client systems, but I think that approach has some real limitations.
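For context, the kind of split Petals does is roughly pipeline parallelism: each peer hosts a contiguous block of transformer layers and passes the activations along. Here is a hypothetical sketch of that idea (the peer URLs, `/forward` endpoint, and payload shape are made up for illustration, not Petals' actual API):

```ts
// Hypothetical pipeline-parallel split across peers. Each peer hosts a
// contiguous block of transformer layers behind a /forward endpoint and
// returns the updated activations. URLs and payload shape are illustrative.
const peers = [
  "https://peer-a.example/forward", // layers 0-15
  "https://peer-b.example/forward", // layers 16-31
];

async function distributedForward(hidden: number[]): Promise<number[]> {
  // Chain the partial forward passes through each peer in order.
  for (const url of peers) {
    const res = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ hidden }),
    });
    ({ hidden } = await res.json()); // each hop applies its layer block
  }
  return hidden;
}
```

The limitations I alluded to are the obvious ones with this design: every generated token round-trips through every peer, so latency stacks up, and a single slow or offline peer stalls the whole pipeline.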
I was wondering if WebLLM is exploring any distribution ideas to scale to larger LLMs, which tend to be more accurate and more capable.
Thanks
I won't speak for the WebLLM project. I have an interest in adding a feature like this to my separate project (https://github.com/DecentAppsNet) which wraps web-llm. Feel free to contact me outside of this issue board if you want more info.
Also, woolball (https://github.com/woolball-xyz) might have what you need.
Hi @lonnietc, there's currently no plan to support distributed inference across multiple clients. WebLLM is intended to run in your browser and utilize the WebGPU abstraction provided by the browser.
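For reference, typical usage looks roughly like this, with everything running client-side on WebGPU (the model ID below is just one example from the prebuilt list):

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Downloads and compiles the model inside the browser tab via WebGPU.
// "Llama-3.1-8B-Instruct-q4f32_1-MLC" is one prebuilt model ID; any entry
// from the prebuilt model list works here.
const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC", {
  initProgressCallback: (report) => console.log(report.text), // load progress
});

// OpenAI-style chat completion, served entirely in-browser.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(reply.choices[0].message.content);
```

Once the weights are cached there is no network hop at all, which is why a cross-client distribution layer would have to live outside the engine itself.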