Liger-Kernel
CI failure due to transformers VLM change
🐛 Describe the bug
CI details: the failures are related to Qwen2VLConfig and the monkey patch implementation.
The text config is separated out from the general config in transformers>=4.52.0.
Qwen2VLRotaryEmbedding now takes Qwen2VLTextConfig instead of Qwen2VLConfig.
https://github.com/huggingface/transformers/pull/37268
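A minimal sketch of how a monkey patch could tolerate both config layouts, assuming the nested text config is exposed as a `text_config` attribute on the top-level config. The helper name `resolve_text_config` is hypothetical and not part of Liger-Kernel or transformers; the `SimpleNamespace` objects are stand-ins for real config instances.

```python
from types import SimpleNamespace


def resolve_text_config(config):
    """Return the nested text config when present, else the config itself.

    Hypothetical helper: assumes newer configs carry a `text_config`
    attribute and older ones keep text fields at the top level.
    """
    text_config = getattr(config, "text_config", None)
    return text_config if text_config is not None else config


# Stand-ins for the two layouts (not real transformers objects).
new_style = SimpleNamespace(text_config=SimpleNamespace(hidden_size=1536))
old_style = SimpleNamespace(hidden_size=1536)

# Both layouts expose hidden_size through the same call.
print(resolve_text_config(new_style).hidden_size)
print(resolve_text_config(old_style).hidden_size)
```

The result of such a helper could then be passed to components like the rotary embedding that expect the text config rather than the full VLM config.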
We also need to check the monkey patches for all VLMs, since most models have been refactored into a TextModel and a VisionModel.
Blocked on: https://github.com/huggingface/transformers/issues/38331 (we can work around it by passing hidden_size and vocab_size to the config dict directly, though)
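One way the workaround could look in a test fixture, sketched with plain dicts. This assumes the config dict nests text fields under a `text_config` key and that the affected code reads `hidden_size` and `vocab_size` from the top level; `mirror_text_fields` is a hypothetical helper, and the values are illustrative only.

```python
def mirror_text_fields(config_dict, keys=("hidden_size", "vocab_size")):
    """Copy selected text-config fields to the top level of a config dict.

    Hypothetical helper: returns a new dict and never overwrites a
    top-level value that is already set.
    """
    mirrored = dict(config_dict)
    text_cfg = config_dict.get("text_config") or {}
    for key in keys:
        if key not in mirrored and key in text_cfg:
            mirrored[key] = text_cfg[key]
    return mirrored


# Illustrative mini config, not taken from a real checkpoint.
mini_config = {
    "text_config": {"hidden_size": 1024, "vocab_size": 32000},
    "vision_config": {"embed_dim": 128},
}
patched = mirror_text_fields(mini_config)
print(patched["hidden_size"], patched["vocab_size"])
```

The patched dict can then be unpacked into the config constructor until the upstream issue is resolved.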
Reproduce
No response
Versions
liger_kernel==0.5.9 transformers>=4.52.0
video_processor (Qwen2VLVideoProcessor) is now a required input for Qwen2VLProcessor.
We might need to add torchvision to our dev dependencies, since the video processor requires it.
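Until the dependency is added, tests that need the video processor could be skipped when torchvision is missing. A small sketch using only the standard library; `module_available` is a hypothetical helper name.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be imported, without importing it."""
    return importlib.util.find_spec(name) is not None


# Gate video-processor tests on torchvision being installed.
HAS_TORCHVISION = module_available("torchvision")
print("torchvision available:", HAS_TORCHVISION)
```

With pytest, the flag could feed a `pytest.mark.skipif(not HAS_TORCHVISION, reason="torchvision required")` marker on the affected tests.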
Related PR: https://github.com/huggingface/transformers/pull/35206
Fixed by #735, #738, #743, #755