Canlin Guo

Results 56 comments of Canlin Guo

> the same error `ImportError: cannot import name 'Qwen2_5_VisionRotaryEmbedding' from 'vllm.model_executor.models.qwen2_5_vl'` Please use vllm-omni v0.11.0rc1 with vllm-ascend v0.11.0rc2 for now. I'm trying to upgrade to vllm-omni with vllm-ascend v0.12.0rc1, which...

After #458 is merged, this PR will break. It needs to be rebased on main and adapted.

@hsliuustc0106 Could you help add a ready tag to ensure the patch doesn't affect GPU? And if CI passes, I think we can merge this PR first and more...

@hsliuustc0106 @david6666666 CI has passed now. Could we merge it?

> [qwen2_5_omni.yaml](https://github.com/user-attachments/files/24387986/qwen2_5_omni.yaml) (Note that we have to update the qwen2_5_omni.yaml file to use only device '0', since the existing qwen2_5_omni.yaml is written to run on 2 cards) Moving all stages into...

To better analyze Qwen3-Omni performance, I have implemented a draft version in #553. I'm not sure whether I'm missing something, so feel free to give suggestions.

vLLM-Omni v0.11.0rc1 doesn't support Qwen3-Omni on NPU yet. We will support it in the next version, but you could try it on the main branch. NOTE: The image of vllm-ascend v0.12.0rc1...

From the log you provided, it's weird that the code path goes into GPUWorker even though your device is an NPU. I guess it's an environment problem. Notice that the env...

Is this a common issue on both GPU and NPU, or only on NPU?

@Isotr0py Could you please help review? Actually, I'm not very familiar with `SharedFusedMoE`, and I'm afraid I could introduce some hidden bugs into the modeling code.