web-llm
How to use WizardLM?
Excited to see support added for other models like WizardLM in https://github.com/mlc-ai/web-llm/pull/75. As I don't have the hardware to build this, would it be possible to run the GitHub Action to create the needed files?
I'm not sure what would need to change to move from Vicuna to WizardLM. I imagine I'd need a different .wasm file and tokenizer.model, and to point cacheUrl somewhere else.
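Something like this, I imagine (the field names mirror what the Vicuna demo uses; all the WizardLM URLs below are placeholders I made up, since those artifacts don't exist yet):

```ts
// Hypothetical config for swapping the Vicuna demo over to WizardLM.
// Field names follow the existing demo; every URL is an invented placeholder.
const wizardLMConfig = {
  // Compiled WebGPU kernels for the model architecture
  wasmUrl: "https://example.com/wizardlm-7b/wizardlm-7b_webgpu.wasm",
  // Base URL for the sharded, quantized weights fetched into the browser cache
  cacheUrl: "https://example.com/wizardlm-7b/params/",
  // SentencePiece tokenizer shipped with the model
  tokenizer: "https://example.com/wizardlm-7b/tokenizer.model",
};
```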
Any help would be much appreciated as I would love to try other models on WebLLM. Thanks!
FYI: @idosal & @tqchen
We are in the process of overhauling and standardizing MLC-LLM and documenting the overall process, so hopefully it will become easier after that. Some of the builds need quite a bit of memory, so a GH Action is unlikely to work, but we will release more model variants.
Ok, sounds good. Is MLC-LLM connected to Web LLM? I understand about the GitHub Actions limitation; hopefully there will be a way to download pre-built js/wasm files, as building this project requires hardware that I don't have. Thanks!
We will be connecting things up so the same model artifact can be used across web, mobile, and CLI.
The last few commits from @jinhongyii look very promising. I'd love to see the demo site updated so we have pre-built files and can see how these changes play out in a live demo. Thanks for all the work being done to make this more powerful!
I've managed to adapt (https://github.com/DustinBrett/daedalOS/commit/0b16bc24e707394ef35818be974de16653a57525) the latest code. Thanks for running a build. Now I just need to wait for (or find) the WizardLM model on HuggingFace, along with the wasm file.
What would be needed to run https://huggingface.co/TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ on Web LLM? Is it possible for someone to build the required files?
Hi, is there a way to use a custom model in Web LLM? And what's the limit for an M2 Pro with 16 GB? Thanks!
Please check out the latest docs at https://mlc.ai/mlc-llm/docs/
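For later readers: with the @mlc-ai/web-llm npm package from around this period, wiring up a custom, pre-built model looked roughly like the sketch below. The shape of the config (model_list, model_lib_map) reflects that era's API, and the model id and both URLs are invented example names, so treat this as illustrative and defer to the docs linked above.

```ts
import { ChatModule } from "@mlc-ai/web-llm";

// Sketch only: registering a custom, pre-built model artifact.
// "my-model-q4f32_0" and both URLs are invented placeholders.
const appConfig = {
  model_list: [
    {
      // Where the quantized weight shards and model config are hosted
      model_url: "https://huggingface.co/someuser/my-model-q4f32_0/resolve/main/",
      local_id: "my-model-q4f32_0",
    },
  ],
  // Maps each local_id to its compiled WebGPU wasm library
  model_lib_map: {
    "my-model-q4f32_0": "https://example.com/my-model-q4f32_0-webgpu.wasm",
  },
};

async function main() {
  const chat = new ChatModule();
  // Fetch the weights and wasm into the browser cache, then run a quick test
  await chat.reload("my-model-q4f32_0", undefined, appConfig);
  console.log(await chat.generate("Hello!"));
}

main();
```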