List of currently available models.
The latest list of models is in the code as JSON.
Here are markdown tables for the same list, grouped by category.
Thanks for this great project. Is there a clean table listing all the available models?
Currently, it is buried in the code --> https://github.com/mlc-ai/web-llm/blob/main/src/config.ts#L293
Update: I compiled the tables below from that link.
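For anyone who wants to regenerate or extend these tables, a minimal sketch follows. It reads the same `model_list` that is exported from `config.ts` as `prebuiltAppConfig`; the field names used here (`model_id`, `vram_required_MB`, `low_resource_required`, `overrides.context_window_size`, `required_features`) are taken from that file, but verify them against your installed web-llm version.

```ts
// Sketch: regenerate the tables below from the package itself, so they stay
// in sync with whatever web-llm version is installed. Field names are taken
// from src/config.ts; double-check against your installed version.
import { prebuiltAppConfig } from "@mlc-ai/web-llm";

console.log("| Model ID | VRAM (MB) | Low Resource? | Context Window | Required Features |");
console.log("|---|---|---|---|---|");
for (const m of prebuiltAppConfig.model_list) {
  const vram = m.vram_required_MB?.toFixed(2) ?? "-";
  const lowRes = m.low_resource_required ? "Yes" : "No";
  const ctx = m.overrides?.context_window_size ?? "-";
  const feats = m.required_features?.join(", ") ?? "-";
  console.log(`| ${m.model_id} | ${vram} | ${lowRes} | ${ctx} | ${feats} |`);
}
```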
Llama-3.2 Models
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| Llama-3.2-1B-Instruct-q4f32_1-MLC | 1128.82 | Yes | 4096 | - | - |
| Llama-3.2-1B-Instruct-q4f16_1-MLC | 879.04 | Yes | 4096 | - | - |
| Llama-3.2-1B-Instruct-q0f32-MLC | 5106.26 | Yes | 4096 | - | - |
| Llama-3.2-1B-Instruct-q0f16-MLC | 2573.13 | Yes | 4096 | - | - |
| Llama-3.2-3B-Instruct-q4f32_1-MLC | 2951.51 | Yes | 4096 | - | - |
| Llama-3.2-3B-Instruct-q4f16_1-MLC | 2263.69 | Yes | 4096 | - | - |
Llama-3.1 Models
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| Llama-3.1-8B-Instruct-q4f32_1-MLC-1k | 5295.70 | Yes | 1024 | - | - |
| Llama-3.1-8B-Instruct-q4f16_1-MLC-1k | 4598.34 | Yes | 1024 | - | - |
| Llama-3.1-8B-Instruct-q4f32_1-MLC | 6101.01 | No | 4096 | - | - |
| Llama-3.1-8B-Instruct-q4f16_1-MLC | 5001.00 | No | 4096 | - | - |
DeepSeek-R1-Distill-Qwen Models
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| DeepSeek-R1-Distill-Qwen-7B-q4f16_1-MLC | 5106.67 | No | 4096 | - | - |
| DeepSeek-R1-Distill-Qwen-7B-q4f32_1-MLC | 5900.09 | No | 4096 | - | - |
DeepSeek-R1-Distill-Llama Models
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| DeepSeek-R1-Distill-Llama-8B-q4f32_1-MLC | 6101.01 | No | 4096 | - | - |
| DeepSeek-R1-Distill-Llama-8B-q4f16_1-MLC | 5001.00 | No | 4096 | - | - |
Hermes Models (Llama Base)
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| Hermes-2-Theta-Llama-3-8B-q4f16_1-MLC | 4976.13 | No | 4096 | - | - |
| Hermes-2-Theta-Llama-3-8B-q4f32_1-MLC | 6051.27 | No | 4096 | - | - |
| Hermes-2-Pro-Llama-3-8B-q4f16_1-MLC | 4976.13 | No | 4096 | - | - |
| Hermes-2-Pro-Llama-3-8B-q4f32_1-MLC | 6051.27 | No | 4096 | - | - |
| Hermes-3-Llama-3.2-3B-q4f32_1-MLC | 2951.51 | Yes | 4096 | - | - |
| Hermes-3-Llama-3.2-3B-q4f16_1-MLC | 2263.69 | Yes | 4096 | - | - |
| Hermes-3-Llama-3.1-8B-q4f32_1-MLC | 5779.27 | No | 4096 | - | - |
| Hermes-3-Llama-3.1-8B-q4f16_1-MLC | 4876.13 | No | 4096 | - | - |
Hermes Models (Mistral Base)
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| Hermes-2-Pro-Mistral-7B-q4f16_1-MLC | 4033.28 | No | 4096 | - | shader-f16 |
Phi-3.5 Models
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| Phi-3.5-mini-instruct-q4f16_1-MLC | 3672.07 | No | 4096 | - | - |
| Phi-3.5-mini-instruct-q4f32_1-MLC | 5483.12 | No | 4096 | - | - |
| Phi-3.5-mini-instruct-q4f16_1-MLC-1k | 2520.07 | Yes | 1024 | - | - |
| Phi-3.5-mini-instruct-q4f32_1-MLC-1k | 3179.12 | Yes | 1024 | - | - |
| Phi-3.5-vision-instruct-q4f16_1-MLC | 3952.18 | Yes | 4096 | VLM | - |
| Phi-3.5-vision-instruct-q4f32_1-MLC | 5879.84 | Yes | 4096 | VLM | - |
Mistral Models
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| Mistral-7B-Instruct-v0.3-q4f16_1-MLC | 4573.39 | No | 4096 | - | shader-f16 |
| Mistral-7B-Instruct-v0.3-q4f32_1-MLC | 5619.27 | No | 4096 | - | - |
| Mistral-7B-Instruct-v0.2-q4f16_1-MLC | 4573.39 | No | 4096 | - | shader-f16 |
| OpenHermes-2.5-Mistral-7B-q4f16_1-MLC | 4573.39 | No | 4096 | - | shader-f16 |
| NeuralHermes-2.5-Mistral-7B-q4f16_1-MLC | 4573.39 | No | 4096 | - | shader-f16 |
| WizardMath-7B-V1.1-q4f16_1-MLC | 4573.39 | No | 4096 | - | shader-f16 |
SmolLM2 Models
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| SmolLM2-1.7B-Instruct-q4f16_1-MLC | 1774.19 | Yes | 4096 | - | shader-f16 |
| SmolLM2-1.7B-Instruct-q4f32_1-MLC | 2692.38 | Yes | 4096 | - | - |
| SmolLM2-360M-Instruct-q0f16-MLC | 871.99 | Yes | 4096 | - | shader-f16 |
| SmolLM2-360M-Instruct-q0f32-MLC | 1743.99 | Yes | 4096 | - | - |
| SmolLM2-360M-Instruct-q4f16_1-MLC | 376.06 | Yes | 4096 | - | shader-f16 |
| SmolLM2-360M-Instruct-q4f32_1-MLC | 579.61 | Yes | 4096 | - | - |
| SmolLM2-135M-Instruct-q0f16-MLC | 359.69 | Yes | 4096 | - | shader-f16 |
| SmolLM2-135M-Instruct-q0f32-MLC | 719.38 | Yes | 4096 | - | - |
Gemma-2 Models
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| gemma-2-2b-it-q4f16_1-MLC | 1895.30 | No | 4096 | - | shader-f16 |
| gemma-2-2b-it-q4f32_1-MLC | 2508.75 | No | 4096 | - | - |
| gemma-2-2b-it-q4f16_1-MLC-1k | 1583.30 | Yes | 1024 | - | shader-f16 |
| gemma-2-2b-it-q4f32_1-MLC-1k | 1884.75 | Yes | 1024 | - | - |
| gemma-2-9b-it-q4f16_1-MLC | 6422.01 | No | 4096 | - | shader-f16 |
| gemma-2-9b-it-q4f32_1-MLC | 8383.33 | No | 4096 | - | - |
| gemma-2-2b-jpn-it-q4f16_1-MLC | 1895.30 | Yes | 4096 | - | shader-f16 |
| gemma-2-2b-jpn-it-q4f32_1-MLC | 2508.75 | Yes | 4096 | - | - |
Qwen-2.5 Models
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| Qwen2.5-0.5B-Instruct-q4f16_1-MLC | 944.62 | Yes | 4096 | - | - |
| Qwen2.5-0.5B-Instruct-q4f32_1-MLC | 1060.20 | Yes | 4096 | - | - |
| Qwen2.5-0.5B-Instruct-q0f16-MLC | 1624.12 | Yes | 4096 | - | - |
| Qwen2.5-0.5B-Instruct-q0f32-MLC | 2654.75 | Yes | 4096 | - | - |
| Qwen2.5-1.5B-Instruct-q4f16_1-MLC | 1629.75 | Yes | 4096 | - | - |
| Qwen2.5-1.5B-Instruct-q4f32_1-MLC | 1888.97 | Yes | 4096 | - | - |
| Qwen2.5-3B-Instruct-q4f16_1-MLC | 2504.76 | Yes | 4096 | - | - |
| Qwen2.5-3B-Instruct-q4f32_1-MLC | 2893.64 | Yes | 4096 | - | - |
| Qwen2.5-7B-Instruct-q4f16_1-MLC | 5106.67 | No | 4096 | - | - |
| Qwen2.5-7B-Instruct-q4f32_1-MLC | 5900.09 | No | 4096 | - | - |
| Qwen2.5-Coder-0.5B-Instruct-q4f16_1-MLC | 944.62 | Yes | 4096 | - | - |
| Qwen2.5-Coder-0.5B-Instruct-q4f32_1-MLC | 1060.20 | Yes | 4096 | - | - |
| Qwen2.5-Coder-0.5B-Instruct-q0f16-MLC | 1624.12 | Yes | 4096 | - | - |
| Qwen2.5-Coder-0.5B-Instruct-q0f32-MLC | 2654.75 | Yes | 4096 | - | - |
| Qwen2.5-Coder-1.5B-Instruct-q4f16_1-MLC | 1629.75 | No | 4096 | - | - |
| Qwen2.5-Coder-1.5B-Instruct-q4f32_1-MLC | 1888.97 | No | 4096 | - | - |
| Qwen2.5-Coder-3B-Instruct-q4f16_1-MLC | 2504.76 | Yes | 4096 | - | - |
| Qwen2.5-Coder-3B-Instruct-q4f32_1-MLC | 2893.64 | Yes | 4096 | - | - |
| Qwen2.5-Coder-7B-Instruct-q4f16_1-MLC | 5106.67 | No | 4096 | - | - |
| Qwen2.5-Coder-7B-Instruct-q4f32_1-MLC | 5900.09 | No | 4096 | - | - |
| Qwen2.5-Math-1.5B-Instruct-q4f16_1-MLC | 1629.75 | Yes | 4096 | - | - |
| Qwen2.5-Math-1.5B-Instruct-q4f32_1-MLC | 1888.97 | Yes | 4096 | - | - |
StableLM Models
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| stablelm-2-zephyr-1_6b-q4f16_1-MLC | 2087.66 | No | 4096 | - | - |
| stablelm-2-zephyr-1_6b-q4f32_1-MLC | 2999.33 | No | 4096 | - | - |
| stablelm-2-zephyr-1_6b-q4f16_1-MLC-1k | 1511.66 | Yes | 1024 | - | - |
| stablelm-2-zephyr-1_6b-q4f32_1-MLC-1k | 1847.33 | Yes | 1024 | - | - |
RedPajama Models
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| RedPajama-INCITE-Chat-3B-v1-q4f16_1-MLC | 2972.09 | No | 2048 | - | shader-f16 |
| RedPajama-INCITE-Chat-3B-v1-q4f32_1-MLC | 3928.09 | No | 2048 | - | - |
| RedPajama-INCITE-Chat-3B-v1-q4f16_1-MLC-1k | 2041.09 | Yes | 1024 | - | shader-f16 |
| RedPajama-INCITE-Chat-3B-v1-q4f32_1-MLC-1k | 2558.09 | Yes | 1024 | - | - |
TinyLlama Models
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| TinyLlama-1.1B-Chat-v1.0-q4f16_1-MLC | 697.24 | Yes | 2048 | - | shader-f16 |
| TinyLlama-1.1B-Chat-v1.0-q4f32_1-MLC | 839.98 | Yes | 2048 | - | - |
| TinyLlama-1.1B-Chat-v1.0-q4f16_1-MLC-1k | 675.24 | Yes | 1024 | - | shader-f16 |
| TinyLlama-1.1B-Chat-v1.0-q4f32_1-MLC-1k | 795.98 | Yes | 1024 | - | - |
Older / Less Practical Models
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| Llama-3.1-70B-Instruct-q3f16_1-MLC | 31153.13 | No | 4096 | - | - |
| Qwen2-0.5B-Instruct-q4f16_1-MLC | 944.62 | Yes | 4096 | - | - |
| Qwen2-0.5B-Instruct-q0f16-MLC | 1624.12 | Yes | 4096 | - | - |
| Qwen2-0.5B-Instruct-q0f32-MLC | 2654.75 | Yes | 4096 | - | - |
| Qwen2-1.5B-Instruct-q4f16_1-MLC | 1629.75 | Yes | 4096 | - | - |
| Qwen2-1.5B-Instruct-q4f32_1-MLC | 1888.97 | Yes | 4096 | - | - |
| Qwen2-7B-Instruct-q4f16_1-MLC | 5106.67 | No | 4096 | - | - |
| Qwen2-7B-Instruct-q4f32_1-MLC | 5900.09 | No | 4096 | - | - |
| Qwen2-Math-1.5B-Instruct-q4f16_1-MLC | 1629.75 | Yes | 4096 | - | - |
| Qwen2-Math-1.5B-Instruct-q4f32_1-MLC | 1888.97 | Yes | 4096 | - | - |
| Qwen2-Math-7B-Instruct-q4f16_1-MLC | 5106.67 | No | 4096 | - | - |
| Qwen2-Math-7B-Instruct-q4f32_1-MLC | 5900.09 | No | 4096 | - | - |
| Llama-3-8B-Instruct-q4f32_1-MLC-1k | 5295.70 | Yes | 1024 | - | - |
| Llama-3-8B-Instruct-q4f16_1-MLC-1k | 4598.34 | Yes | 1024 | - | - |
| Llama-3-8B-Instruct-q4f32_1-MLC | 6101.01 | No | 4096 | - | - |
| Llama-3-8B-Instruct-q4f16_1-MLC | 5001.00 | No | 4096 | - | - |
| Llama-3-70B-Instruct-q3f16_1-MLC | 31153.13 | No | 4096 | - | - |
| Phi-3-mini-4k-instruct-q4f16_1-MLC | 3672.07 | No | 4096 | - | - |
| Phi-3-mini-4k-instruct-q4f32_1-MLC | 5483.12 | No | 4096 | - | - |
| Phi-3-mini-4k-instruct-q4f16_1-MLC-1k | 2520.07 | Yes | 1024 | - | - |
| Phi-3-mini-4k-instruct-q4f32_1-MLC-1k | 3179.12 | Yes | 1024 | - | - |
| Llama-2-7b-chat-hf-q4f32_1-MLC-1k | 5284.01 | No | 1024 | - | - |
| Llama-2-7b-chat-hf-q4f16_1-MLC-1k | 4618.52 | No | 1024 | - | shader-f16 |
| Llama-2-7b-chat-hf-q4f32_1-MLC | 9109.03 | No | 4096 | - | - |
| Llama-2-7b-chat-hf-q4f16_1-MLC | 6749.02 | No | 4096 | - | shader-f16 |
| Llama-2-13b-chat-hf-q4f16_1-MLC | 11814.09 | No | 4096 | - | shader-f16 |
| gemma-2b-it-q4f16_1-MLC | 1476.52 | No | 4096 | - | shader-f16 |
| gemma-2b-it-q4f32_1-MLC | 1750.66 | No | 4096 | - | - |
| gemma-2b-it-q4f16_1-MLC-1k | 1476.52 | Yes | 1024 | - | shader-f16 |
| gemma-2b-it-q4f32_1-MLC-1k | 1750.66 | Yes | 1024 | - | - |
| phi-2-q4f16_1-MLC | 3053.97 | No | 2048 | - | shader-f16 |
| phi-2-q4f32_1-MLC | 4032.48 | No | 2048 | - | - |
| phi-2-q4f16_1-MLC-1k | 2131.97 | Yes | 1024 | - | shader-f16 |
| phi-2-q4f32_1-MLC-1k | 2740.48 | Yes | 1024 | - | - |
| phi-1_5-q4f16_1-MLC | 1210.09 | Yes | 2048 | - | shader-f16 |
| phi-1_5-q4f32_1-MLC | 1682.09 | Yes | 2048 | - | - |
| phi-1_5-q4f16_1-MLC-1k | 1210.09 | Yes | 1024 | - | shader-f16 |
| phi-1_5-q4f32_1-MLC-1k | 1682.09 | Yes | 1024 | - | - |
| TinyLlama-1.1B-Chat-v0.4-q4f16_1-MLC | 697.24 | Yes | 2048 | - | shader-f16 |
| TinyLlama-1.1B-Chat-v0.4-q4f32_1-MLC | 839.98 | Yes | 2048 | - | - |
| TinyLlama-1.1B-Chat-v0.4-q4f16_1-MLC-1k | 675.24 | Yes | 1024 | - | shader-f16 |
| TinyLlama-1.1B-Chat-v0.4-q4f32_1-MLC-1k | 795.98 | Yes | 1024 | - | - |
Embedding Models
| Model ID | VRAM (MB) | Low Resource? | Context Window | Model Type | Required Features |
|---|---|---|---|---|---|
| snowflake-arctic-embed-m-q0f32-MLC-b32 | 1407.51 | - | 512 | embedding | - |
| snowflake-arctic-embed-m-q0f32-MLC-b4 | 539.40 | - | 512 | embedding | - |
| snowflake-arctic-embed-s-q0f32-MLC-b32 | 1022.82 | - | 512 | embedding | - |
| snowflake-arctic-embed-s-q0f32-MLC-b4 | 238.71 | - | 512 | embedding | - |
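To actually load one of the model IDs above, web-llm's documented `CreateMLCEngine` and OpenAI-style `chat.completions.create` API can be used; a minimal sketch, including the standard WebGPU check for the `shader-f16` requirement that some rows list:

```ts
// Sketch: load a model ID from the tables above and run one completion.
// CreateMLCEngine, initProgressCallback, and engine.chat.completions.create
// follow web-llm's documented OpenAI-style API.
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Rows with "shader-f16" in Required Features need WebGPU f16 support;
  // this check uses the standard WebGPU adapter API.
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter?.features.has("shader-f16")) {
    console.warn("shader-f16 unavailable; pick a q4f32_1 variant instead");
  }

  const engine = await CreateMLCEngine("Llama-3.2-1B-Instruct-q4f16_1-MLC", {
    initProgressCallback: (report) => console.log(report.text),
  });

  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Hello!" }],
  });
  console.log(reply.choices[0].message.content);
}

main();
```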
Thank you very much, @metalshanked! Can you add the download size too?
Hi,
I’ve created a serverless single-page HTML chat where users can select models from a dropdown. The dropdown is pre-populated with four models, but I’d like the README to point users to a full, up-to-date list of available models, so they can easily customize their own dropdown by editing the .html source.
Should I link to this issue (#683) and the config.ts file, or is there an official page documenting all models?
Thanks in advance!
Glauco
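One way to keep such a dropdown current without hand-editing the HTML is to populate it from the packaged `prebuiltAppConfig` at page load; a sketch, assuming a hypothetical `<select id="model-select">` element exists in the page:

```ts
// Sketch: fill a dropdown from the packaged model list so it always tracks
// the installed web-llm version. The element ID "model-select" is
// hypothetical; adjust it to match your page.
import { prebuiltAppConfig } from "@mlc-ai/web-llm";

const select = document.getElementById("model-select") as HTMLSelectElement;
for (const m of prebuiltAppConfig.model_list) {
  const option = document.createElement("option");
  option.value = m.model_id;
  option.textContent = `${m.model_id} (~${Math.round(m.vram_required_MB ?? 0)} MB VRAM)`;
  select.appendChild(option);
}
```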
This is the longest stretch without any updates (over four months since the repo received any changes). I do wonder if MLC plans on keeping WebLLM alive. They were making amazing progress throughout 2024 but came to an abrupt stop this year.
Thanks all for the input. This is a great point, and we should definitely add a list of models somewhere and point to it from the README, documentation, webpage, etc.
> I do wonder if MLC plans on keeping WebLLM alive
Yes, we want to keep WebLLM alive. Apologies for the slowdown in development in the past few months. Try out the Qwen3 we just added yesterday: you can toggle thinking with this model! Go to https://chat.webllm.ai/, select Qwen3, and try toggling the thinking icon in the toolbar.
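For the thinking toggle mentioned above, a heavily hedged sketch: both the model ID and the `extra_body.enable_thinking` flag are assumptions (mirroring the server-side MLC engine API), so check the current web-llm typings before relying on them.

```ts
// Hedged sketch of toggling Qwen3 "thinking" per request. The model ID and
// the extra_body.enable_thinking flag are ASSUMPTIONS based on the MLC serve
// API; verify against the current web-llm typings.
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function demo() {
  const engine = await CreateMLCEngine("Qwen3-1.7B-q4f16_1-MLC"); // assumed ID

  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "What is 17 * 24?" }],
    extra_body: { enable_thinking: false }, // assumed flag
  });
  console.log(reply.choices[0].message.content);
}

demo();
```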