Canlin Guo
## Limitation

- We need at least 4 cards (910B) on NPU instead of 2 cards (A100, H100) on GPU to avoid OOM.

## Known Issues

### Overview

- Same as Qwen2.5-Omni,...
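The 4-card requirement above corresponds to a larger tensor-parallel degree at launch time. A minimal sketch, assuming the standard `vllm serve` CLI; the model name is illustrative, not taken from this thread:

```shell
# Sketch: spread the model across 4 Ascend 910B cards to avoid OOM.
# On A100/H100 GPUs, --tensor-parallel-size 2 is reported to suffice.
vllm serve Qwen/Qwen3-Omni-30B-A3B-Instruct \
  --tensor-parallel-size 4
```

`--tensor-parallel-size` shards the model weights evenly across devices, which is why the per-card memory requirement drops as the degree grows.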
Qwen3-Omni can now run on NPU with this PR. I'll fix the batch issue for Qwen2.5-Omni and Qwen3-Omni later.
> Sorry for the misleading docker file installation: https://docs.vllm.ai/projects/vllm-omni/en/latest/getting_started/installation/npu/#recommended
>
> I think you can `uv pip install vllm-omni` directly for v0.11.0rc1 rather than building from source.
>
> [@gcanlin](https://github.com/gcanlin) could...
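The suggested install path above can be sketched as follows; the exact release on PyPI is an assumption based on the v0.11.0rc1 tag mentioned in the quote:

```shell
# Sketch: install the prebuilt wheel instead of building from source.
# Assumes a vllm-omni release matching v0.11.0rc1 is published.
uv pip install vllm-omni
```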
https://github.com/vllm-project/vllm-omni/pull/434 is fixing this.
Please check out commit `9464e14` and use vllm-ascend v0.11.0rc2 with vllm v0.11.0. We're still upgrading vllm-ascend to v0.12.0rc1.
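The pinned combination above can be reproduced as follows; the `pip` version specifiers are assumptions mapping the release tags in this comment to package versions:

```shell
# Sketch: pin to the known-good combination from this thread.
git checkout 9464e14                                # vllm-omni commit
pip install vllm==0.11.0 vllm-ascend==0.11.0rc2     # tags quoted above
```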
> I'm using vllm-ascend v0.11.0rc1 and get this error:
>
> INFO 12-23 09:07:54 [importing.py:63] Triton not installed or not compatible; certain GPU-related functions will not be available. WARNING...
Request tasks:

- Qwen2.5-Omni offline tests have been working and are almost done in #168.
- API
  - OpenAI API for image generation
cc @tzhouam @R2-Y @hsliuustc0106 PTAL and add a ready tag to test all models. Thanks!
> could you please post the test result before and after this commit?

Of course. Updated now.
@hsliuustc0106 CI breaks because of `Gateway Timeout`. How can I fix it? Or could you please help retry it?

```
[2025-12-24T14:40:52Z] Installing collected packages: pip
[2025-12-24T14:40:53Z] Successfully installed pip-25.3...
```