Yang Fan

Results 64 comments of Yang Fan

> I found that vllm's text model directly reuses Qwen2 code. In hf code, the text model position encoding function apply_multimodal_rotary_pos_emb of qwen2vl and the apply_rotary_pos_emb code of qwen2 are...

> The VLM part looks good to me overall. Just a few things left to do: > > * Could you add a test case (under `tests/`) to verify that...

> 咨询了我们的同事 @fyabc ,vllm openai api server多图支持是今天的vllm版本刚刚更新的(而而阿里云dashscope的模型服务一直支持多图),需要等我们的同事更新下docker。 > > @fyabc 更新好了docker后,如方便的话,辛苦在此issue说明下已更新。 @yunfucheng 您好,docker镜像已更新,支持多图推理,注意需要在启动`vllm.entrypoints.openai.api_server`时加上`--limit-mm-per-prompt image=10`类似的参数修改默认的每条请求图像数量最大值。 此外还需注意,目前vllm openai api server暂时还不支持视频输入,该功能目前正在开发中。 更多细节可参考[此处](https://github.com/QwenLM/Qwen2-VL?tab=readme-ov-file#notes) @JianxinMa

@ebsmothers Hi, thank you for you comments! Qwen2.5 only need to update tokenizer with new special tokens and chat template. I will take a look at this.