Simon Mo
This issue tracks follow-up enhancements after initial support for the Deepseek V3 model. Please feel free to chime in and contribute! - [x] Follow up #11523: enhance testing with...
### Discussed in https://github.com/vllm-project/vllm/discussions/3072 Originally posted by **petrosbaltzis** February 28, 2024 Hello, The vLLM library provides the ability to load the model and the tokenizer either from a local folder...
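To make the two loading modes concrete, here is a minimal sketch (not taken from the discussion itself; the model ID and local paths are placeholders) of pointing vLLM at either the Hugging Face Hub or a local folder:

```python
from vllm import LLM

# Load by Hugging Face Hub model ID (downloads weights and tokenizer)...
llm_hub = LLM(model="facebook/opt-125m")

# ...or load from a local folder, optionally with a separate tokenizer path.
llm_local = LLM(model="/models/opt-125m", tokenizer="/models/opt-125m")

# generate() returns one RequestOutput per prompt.
print(llm_local.generate("Hello, my name is")[0].outputs[0].text)
```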
### Motivation. The OpenVINO backend was initially integrated as an alternative to the CPU backend and has branched the vLLM execution logic at every level (executor, model runner, and attention...
This page is accessible via [roadmap.vllm.ai](https://roadmap.vllm.ai/) This is a living document! For each item here, we intend to link the RFC as well as the discussion Slack channel in the [vLLM...
vLLM v0.8.4 and later natively supports all Qwen3 and Qwen3MoE models. Example command: * `vllm serve Qwen/... --enable-reasoning --reasoning-parser deepseek_r1` * All models should work with the command above....
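Once the server is up, it can be queried through the OpenAI-compatible API; with a reasoning parser enabled, vLLM returns the thinking trace separately from the final answer. A minimal sketch, assuming the default port 8000 and a served model name of `Qwen/Qwen3-8B` (a placeholder; use whatever was passed to `vllm serve`):

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible endpoint; the API key is unused by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="Qwen/Qwen3-8B",  # placeholder: must match the served model
    messages=[{"role": "user", "content": "What is 17 * 24?"}],
)

msg = resp.choices[0].message
# With --reasoning-parser set, the thinking trace is surfaced in the
# reasoning_content field (a vLLM extension to the OpenAI schema).
print("reasoning:", getattr(msg, "reasoning_content", None))
print("answer:", msg.content)
```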
### 🚀 The feature, motivation and pitch It is common to have a scenario where folks want to deploy multiple vLLM instances on a single machine due to the machine...
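A common workaround today is sketched below, under assumptions not stated in the issue (the model name, ports, and one GPU per instance are placeholders): pin each `vllm serve` process to its own GPU via `CUDA_VISIBLE_DEVICES` and give each its own port.

```python
import os
import subprocess

MODEL = "Qwen/Qwen3-8B"  # placeholder model

# Launch one server per GPU, each on its own port.
procs = []
for gpu, port in [(0, 8000), (1, 8001)]:
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
    procs.append(subprocess.Popen(
        ["vllm", "serve", MODEL, "--port", str(port)],
        env=env,
    ))

for p in procs:
    p.wait()  # block until the servers exit (Ctrl+C to stop)
```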