web-llm
High-performance In-browser LLM Inference Engine
8GB GPU
Vicuna model: After some input text and a few generated questions, GPU VRAM (per the Win10 Task Manager) fills the full 8 GB and token generation becomes slow, but directly after start...
Pitch: Consider creating a VS Code extension that runs locally using WebGPU for a specific programming language... example: https://huggingface.co/bigcode/tiny_starcoder_py ? Demonstrate that using local resources might be enough for...
I had the same [issue with WebSD](https://github.com/mlc-ai/web-stable-diffusion/issues/48). When I switched to Canary, WebSD worked, but WebLLM still gives the error frequently (image attached).
Considering that many users deploy services to the public network for access, moderation is actually quite crucial. Suggested solution: add a content audit and filtering service with a configurable on/off switch....
Hi, I am trying to set up the WebLLM simple-chat from examples/simple-chat. `npm install` and `npm start` are successful and I get the web service up. However, the first message in...
Is it possible to compile web-llm into a wasm file? I want to use wasmtime (or another wasm runtime) as the backend to run the LLM. If this is feasible, many...
I discussed this in Discord. I would like a way to change the system message after doing a completion. Right now the only way to do that is to trigger...
Adding web-llm as an LLM in LangChain would be really useful. https://js.langchain.com/docs/getting-started/guide-llm For example, it would be nice to have a web-llm option that works similarly to the OpenAI one....
Install all the prerequisites for compilation: [emscripten](https://emscripten.org/). It is an LLVM-based compiler that compiles C/C++ source code to WebAssembly. Follow the [installation instructions](https://emscripten.org/docs/getting_started/downloads.html#installation-instructions-using-the-emsdk-recommended) to install the latest emsdk. Source emsdk_env.sh...
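The emsdk setup described above can be sketched as the following shell session. This is a minimal sketch based on the linked emscripten installation guide; the clone location and the `latest` version tag are assumptions, not part of the web-llm instructions themselves.

```shell
# Fetch the emsdk (Emscripten SDK) manager.
git clone https://github.com/emscripten-core/emsdk.git
cd emsdk

# Download and activate the latest toolchain release.
./emsdk install latest
./emsdk activate latest

# Put emcc and related tools on PATH for the current shell.
source ./emsdk_env.sh

# Verify the compiler is available before building web-llm.
emcc --version
```

Note that `source ./emsdk_env.sh` only affects the current shell; it must be re-run (or added to your shell profile) in each new terminal session before compiling.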