CI failure due to transformers VLM change

Open Tcc0403 opened this issue 6 months ago • 1 comments

🐛 Describe the bug

CI details: the failure is related to Qwen2VLConfig and the monkey patch implementation.

The text config is separated out from the general config in transformers>=4.52.0. Qwen2VLRotaryEmbedding now takes Qwen2VLTextConfig instead of Qwen2VLConfig.

https://github.com/huggingface/transformers/pull/37268
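
A minimal sketch of how a patch could tolerate both config layouts, assuming the nested config is exposed as a `text_config` attribute. `resolve_text_config` is a hypothetical helper, not part of Liger-Kernel or transformers, and the `SimpleNamespace` objects (with made-up sizes) only stand in for the real config classes:

```python
from types import SimpleNamespace


def resolve_text_config(config):
    """Return the nested text config when present, else the config itself.

    Hypothetical helper for illustration only.
    """
    text_config = getattr(config, "text_config", None)
    return text_config if text_config is not None else config


# Stand-in for the flat layout (text fields live on the top-level config).
flat_config = SimpleNamespace(hidden_size=1536, vocab_size=32000)

# Stand-in for the split layout (text fields live under text_config).
split_config = SimpleNamespace(
    text_config=SimpleNamespace(hidden_size=1536, vocab_size=32000)
)

# Both layouts resolve to an object that carries the text fields.
assert resolve_text_config(flat_config).hidden_size == 1536
assert resolve_text_config(split_config).vocab_size == 32000
```

With something like this, code that builds the rotary embedding could pass the resolved object rather than the top-level config, regardless of the installed transformers version.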

We also need to check the monkey patches for all VLMs, since most models have been refactored into a TextModel and a VisionModel.

Blocked on: https://github.com/huggingface/transformers/issues/38331 (we can work around it by passing hidden_size and vocab_size to the config dict directly, though)

Reproduce

No response

Versions

liger_kernel==0.5.9 transformers>=4.52.0

Tcc0403, May 22 '25 05:05

video_processor (Qwen2VLVideoProcessor) is now a required input for Qwen2VLProcessor.

We might need to add torchvision to our dev dependencies, since the video processor requires it.

Related PR: https://github.com/huggingface/transformers/pull/35206
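
Until the dependency is added, a test could guard on whether torchvision is importable. This is only a sketch using the standard library; `has_torchvision` is a hypothetical helper name, not something that exists in the repo:

```python
import importlib.util


def has_torchvision() -> bool:
    """Report whether torchvision can be found in the current environment."""
    return importlib.util.find_spec("torchvision") is not None


if not has_torchvision():
    # The video processor cannot be built without torchvision.
    print("torchvision not installed; skipping video processor tests")
```

A guard like this only hides the failure locally; CI would still need torchvision installed for the Qwen2VLProcessor tests to actually run.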

Tcc0403, May 26 '25 10:05

Fixed by #735, #738, #743, and #755.

Tcc0403, Jun 13 '25 14:06