Yang Fan

64 comments by Yang Fan

> Thanks for implementing this (and sorry for the delayed response)! Since this PR not only introduces a new modality (video) but also involves the first model to accept multiple...

Hi @DarkLight1337 @ywang96, I have updated this PR based on your review comments; please check it again. I have also added some notes about multiple modalities to the PR overview.

> @fyabc Hi, can this patch support multiple images in one prompt, as follows: > > ``` > Compute the value of the expression in the image below \nby using...

> I did some local testing on this PR and it's working well for both `.jpg` and `.mp4` inputs, on TP=1 and TP=2. > > Note that I did run...

> The use of `qwen-vl-utils` is quite different from the existing models which fully rely on the `AutoProcessor` from HuggingFace. Is there a particular reason why the preprocessing logic for...

> > I happened to notice something while following this PR. I've merged it locally to run some tests, and during testing, I encountered a strange issue where, after several...

> support qwen2 vl awq model?

@seanzhang-zhichen Yes, this PR supports AWQ models. You can check [this issue](https://github.com/QwenLM/Qwen2-VL/issues/20) for more details.

> I found that GPTQ quantization will prompt the following error. If I skip these weight readings according to the qwen2 code, it will run. Is it reasonable to merge...

> Huge thanks for the PR! We are very happy to support this model very soon :) > > I just left some comments mostly requiring some clarifications about mrope....

Hi @DarkLight1337 @ywang96 @WoosukKwon, I have updated this PR according to the review comments; please check it again:

1. Add an `xformers` backend for `Qwen2VisionAttention` and remove the dependency on `flash-attn`.
2. ...
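For readers unfamiliar with the idea behind item 1, here is a minimal sketch of how an optional attention backend can be chosen at runtime so that `flash-attn` is no longer a hard requirement. It is not the code from this PR: the function name `select_attention_backend`, its parameters, and the preference order are all assumptions made for illustration.

```python
import importlib.util
from typing import Callable, Sequence


def _is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def select_attention_backend(
    preferred: Sequence[str] = ("flash_attn", "xformers"),  # assumed order
    is_installed: Callable[[str], bool] = _is_installed,
) -> str:
    """Pick the first available backend from `preferred` (hypothetical helper).

    `is_installed` is injectable so the selection logic can be exercised
    without either package being present.
    """
    for name in preferred:
        if is_installed(name):
            return name
    raise ImportError(
        f"No attention backend available; tried: {', '.join(preferred)}"
    )
```

With this shape, an environment that has only `xformers` installed still gets a working backend, and the caller can branch on the returned name when building the attention layer.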