
Does the example output in the README match what you expect?

> My example output: (missing the leading ``)

@qibinlin The leading `` lives in the chat template and belongs to the prompt, so it will not appear in the response.

See here for the positional encoding: https://github.com/huggingface/transformers/blob/main/src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py#L1509

> > See here for the positional encoding: https://github.com/huggingface/transformers/blob/main/src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py#L1509
>
> [@wulipc](https://github.com/wulipc) Dynamic FPS (frames per second) training was introduced; where is this dynamic frame rate reflected?

@cqray1990 FPS is the frame-sampling rate of the video; it can be set dynamically during both training and inference. https://github.com/huggingface/transformers/blob/main/src/transformers/models/qwen2_5_vl/processing_qwen2_5_vl.py#L142
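To make "dynamic FPS" concrete, here is a standalone sketch (not the actual Qwen2.5-VL processor code, which is linked above) of what resampling a video to a per-call target FPS looks like: given the clip's native frame rate, pick which frame indices survive.

```python
# Hypothetical illustration of dynamic FPS sampling: the target fps is a
# runtime argument, so each training or inference call can sample the same
# video at a different rate.

def sample_frame_indices(total_frames: int, video_fps: float, target_fps: float) -> list[int]:
    """Return the indices of frames kept when resampling a clip to target_fps."""
    step = video_fps / target_fps          # keep one frame every `step` source frames
    n_samples = int(total_frames / step)   # number of frames after resampling
    return [min(int(i * step), total_frames - 1) for i in range(n_samples)]

# A 30 fps clip with 90 frames, sampled at 2 fps, keeps 6 frames:
print(sample_frame_indices(90, 30.0, 2.0))  # -> [0, 15, 30, 45, 60, 75]
```

The real processor handles more (min/max frame caps, temporal position ids), but the core idea is the same: FPS is a sampling parameter, not a property baked into the model.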

We haven't thoroughly tested `VLLM_USE_FLASHINFER_MOE_FP16` internally, so it is not set as the default configuration. For more optimization techniques related to Expert Parallelism (EP), please refer to the community documentation:...

vLLM is a good choice. Please refer to the [Deploy](https://github.com/QwenLM/Qwen2.5-VL?tab=readme-ov-file#deployment) section of the README.

Hi, this log does not show the core error. The script has so far only been tested on Qwen3 235A22; please post the full log and I will take a look. If the model is too large for you to run, keep an eye out for the smaller models we will be releasing soon. Thanks for supporting Qwen. Also, for the Qwen2.5-VL web demo you can follow the old setup instructions: https://github.com/QwenLM/Qwen3-VL/blob/d2240f11656bfe404b9ba56db4e51cd09f522ff1/web_demo_mm.py

@XYZ-916 @nneowvoincee Limiting the thinking length is not currently supported, nor is a `thinking_budget` parameter. One idea is to use a logits processor that forces the model to close its thinking once the preset length is reached, but vLLM does not yet support per-request logits processors, so this cannot be implemented in vLLM for now.
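The logits-processor idea above could be sketched as follows. This is a framework-agnostic toy (plain NumPy, not vLLM or real Qwen code); the end-of-thinking token id and the budget are made-up placeholders.

```python
import numpy as np

class ThinkingBudgetProcessor:
    """Once `budget` tokens have been generated, mask the logits so that only
    the end-of-thinking token can be emitted; afterwards generation proceeds
    normally. Token ids here are hypothetical placeholders."""

    def __init__(self, prompt_len: int, budget: int, end_think_id: int):
        self.prompt_len = prompt_len
        self.budget = budget
        self.end_think_id = end_think_id
        self.done = False  # only force the closing token once

    def __call__(self, input_ids: list[int], scores: np.ndarray) -> np.ndarray:
        generated = len(input_ids) - self.prompt_len
        if not self.done and generated >= self.budget:
            forced = np.full_like(scores, -np.inf)  # forbid every token...
            forced[self.end_think_id] = 0.0         # ...except end-of-thinking
            self.done = True
            return forced
        return scores

# Usage with a toy vocab of 8 ids and a 2-token thinking budget:
proc = ThinkingBudgetProcessor(prompt_len=3, budget=2, end_think_id=5)
masked = proc(list(range(5)), np.zeros(8))  # 2 tokens generated -> budget hit
# `masked` now allows only token id 5
```

Because the processor carries per-request state (`prompt_len`, `done`), it has to be instantiated per request, which is exactly why vLLM's current global logits-processor mechanism cannot host it.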

@elliotgao @Jun-Howie We identified that the issue is caused by the `get_input_positions` function. In pure text mode, it produces `position_ids` with shape `[n]`, while the correct shape should be `[3,...
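The shape mismatch above can be illustrated with a small NumPy sketch (hypothetical, not the actual vLLM fix): Qwen2.5-VL's M-RoPE expects position ids with a leading axis of 3 (temporal / height / width). In pure-text mode all three components are the same `0..n-1` ramp, so a `[n]` vector must be expanded to `[3, n]`.

```python
import numpy as np

def text_only_mrope_positions(n_tokens: int) -> np.ndarray:
    """Build text-only M-RoPE position ids: three identical 0..n-1 rows."""
    pos = np.arange(n_tokens)     # shape [n] -- the buggy shape
    return np.tile(pos, (3, 1))   # shape [3, n] -- the shape M-RoPE expects

print(text_only_mrope_positions(4).shape)  # (3, 4)
```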

@elliotgao @Jun-Howie The fix has been merged into vLLM's main branch. Please try it out and see whether it resolves your problem. Thanks again for your feedback! https://github.com/vllm-project/vllm/pull/18526