web-llm
High-performance In-browser LLM Inference Engine
Hi, for running web-llm in the browser I've tried an NVIDIA T2000 and the built-in GPU of an Intel i9. The T2000 is faster at prompt ingestion, at ~17 tokens/s...
Would you consider adding p2p download support? After all, the models are too large for web users to download; WebGPU should go hand in hand with p2p, unless a powerful enterprise can support...
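web-llm has no p2p transport today; purely as a sketch of the idea, the receiving side of a WebRTC-based shard download might look like the following. Signaling, peer discovery, chunk ordering, and integrity checks are all omitted, and the `"EOF"` sentinel is invented for this example.

```typescript
// Sketch only: receive model bytes from a peer over a WebRTC data channel
// instead of a central CDN. The sender is assumed to stream ArrayBuffer
// chunks and finish with a hypothetical "EOF" string sentinel.
async function receiveModelFromPeer(pc: RTCPeerConnection): Promise<Blob> {
  return new Promise((resolve, reject) => {
    pc.ondatachannel = (event) => {
      const channel = event.channel;
      channel.binaryType = "arraybuffer";
      const chunks: ArrayBuffer[] = [];
      channel.onmessage = (msg) => {
        if (msg.data === "EOF") {
          resolve(new Blob(chunks)); // reassembled model shard
        } else {
          chunks.push(msg.data as ArrayBuffer);
        }
      };
      channel.onerror = () => reject(new Error("data channel failed"));
    };
  });
}
```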
Great repository! Is it within your scope to implement a WebGPU-accelerated version of Whisper? Not sure if this helps, but there is a [C port of Whisper with CPU...
It would be really nice to have WebGPU support for running other transformer models, like SBERT and embedding models. For example, here's [transformer.js](https://xenova.github.io/transformers.js/). Thanks! @jinhongyii
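For reference, a sentence-embedding call through Transformers.js looks roughly like this (it runs through ONNX Runtime rather than web-llm's WebGPU kernels; the model name is just an example checkpoint):

```typescript
import { pipeline } from "@xenova/transformers";

// Feature-extraction pipeline; "Xenova/all-MiniLM-L6-v2" is one of the
// MiniLM sentence-embedding checkpoints published under that namespace.
const extractor = await pipeline(
  "feature-extraction",
  "Xenova/all-MiniLM-L6-v2"
);

// Mean-pool and normalize token embeddings into one sentence vector.
const embedding = await extractor("In-browser LLM inference", {
  pooling: "mean",
  normalize: true,
});
console.log(embedding.dims); // e.g. [1, 384]
```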
Excited to see support added for other models like WizardLM in https://github.com/mlc-ai/web-llm/pull/75. As I don't have the hardware to build this, would it be possible to run the GitHub Action...
I got the following errors on your demo page.

```
Find an error initializing the WebGPU device
Error: Cannot find adapter that matches the request
Init error, Error: Find an error...
```
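A standalone check, independent of web-llm, can narrow down where that "Cannot find adapter" failure happens (assuming the `@webgpu/types` definitions so `navigator.gpu` type-checks):

```typescript
// Diagnostic for the "Cannot find adapter" failure: first check whether the
// browser exposes WebGPU at all, then whether it hands back an adapter.
async function diagnoseWebGPU(): Promise<void> {
  if (!("gpu" in navigator)) {
    // The browser has no WebGPU at all (or it is behind a flag).
    throw new Error("WebGPU API not exposed by this browser");
  }
  const adapter = await navigator.gpu.requestAdapter();
  if (adapter === null) {
    // The state the demo reports: API present, but no usable adapter,
    // often due to driver/OS support or disabled hardware acceleration.
    throw new Error("Cannot find adapter that matches the request");
  }
  const device = await adapter.requestDevice();
  console.log("WebGPU device acquired", device);
}
```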
I have an integrated AMD GPU (512 MB dedicated memory, 11.6 GB shared) and a discrete NVIDIA GPU (6 GB dedicated memory, 11.6 GB shared). Here are the results...
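On a dual-GPU machine like this, the browser decides which adapter backs WebGPU; `powerPreference` is the standard hint for steering that choice, so benchmark numbers can differ depending on which request the page makes. A minimal sketch:

```typescript
// Request each adapter variant; which physical GPU each hint maps to is
// ultimately up to the browser and OS.
const discrete = await navigator.gpu.requestAdapter({
  powerPreference: "high-performance", // usually maps to the discrete GPU
});
const integrated = await navigator.gpu.requestAdapter({
  powerPreference: "low-power", // usually maps to the integrated GPU
});
console.log("high-performance adapter:", discrete);
console.log("low-power adapter:", integrated);
```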
failed to find library when building model

```
Traceback (most recent call last):
  File "/Users/wangxj/web-llm/build.py", line 200, in <module>
    build(mod, ARGS)
  File "/Users/wangxj/web-llm/build.py", line 174, in build
    ex.export_library(os.path.join(args.artifact_path, output_filename))
  File "/Users/wangxj/web-llm/venv/lib/python3.9/site-packages/tvm/relax/vm_build.py",...
```
Hi there! I want to share that I've been enjoying using web-llm and its demo for some time now. However, I found that the demo didn't quite meet my needs...
Maybe WebGL support is a better choice, and gpu.js could be used for this: [gpu.js](https://github.com/gpujs/gpu.js)
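For context, this is roughly what the gpu.js API looks like (a toy element-wise kernel, not an LLM workload):

```typescript
import { GPU, IKernelFunctionThis } from "gpu.js";

// gpu.js compiles a restricted JavaScript kernel into a WebGL shader,
// with a CPU fallback. Toy kernel: element-wise product of two vectors.
const gpu = new GPU();
const multiply = gpu
  .createKernel(function (this: IKernelFunctionThis, a: number[], b: number[]) {
    return a[this.thread.x] * b[this.thread.x];
  })
  .setOutput([4]);

console.log(multiply([1, 2, 3, 4], [10, 20, 30, 40])); // [10, 40, 90, 160]
```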