LLaVA-NeXT
Conversion of checkpoints to hf format
Hi, just wanted to share this conversion script, which is part of a PR to integrate LLaVA-OneVision into the transformers package: https://github.com/zucchini-nlp/transformers/blob/llava-onevision/src/transformers/models/llava_onevision/convert_llava_onevision_weights_to_hf.py
It works well for the original LLaVA-OneVision checkpoints, and I adapted it for my own checkpoints, in case anyone is interested. I was wondering why this conversion script is not shared in this repo?
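For anyone trying it out, the invocation looks roughly like the following; the flag names are my best guess based on other conversion scripts in transformers, so double-check the argparse block at the bottom of the script:

```bash
# Hypothetical invocation: flag names may differ, see the script's argparse section.
python convert_llava_onevision_weights_to_hf.py \
    --model_id lmms-lab/llava-onevision-qwen2-0.5b-ov \
    --pytorch_dump_folder_path ./llava-onevision-qwen2-0.5b-ov-hf \
    --push_to_hub  # optional: also upload the converted checkpoint to the Hub
```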
How do I convert local LLaVA-OV weights (after merging the LoRA adapter) to the Hugging Face version?
Thank you for suggesting this integration.
I checked out the default lmms-lab/llava-onevision-qwen2-0.5b-ov checkpoint, converted it to the HF version, and was able to do a forward pass, generate, and batch generate.
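For reference, my check looked roughly like the sketch below; the model path and prompt are placeholders, and this assumes a transformers build that already includes the LlavaOnevision classes from the PR:

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration

# Placeholder: folder written by the conversion script.
model_path = "./llava-onevision-qwen2-0.5b-ov-hf"

processor = AutoProcessor.from_pretrained(model_path)
model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    model_path, torch_dtype=torch.float16, low_cpu_mem_usage=True
).to("cuda")

# Build a chat-style prompt with one image placeholder.
conversation = [
    {"role": "user", "content": [{"type": "image"}, {"type": "text", "text": "What is shown in this image?"}]},
]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Single forward pass / generate.
inputs = processor(images=image, text=prompt, return_tensors="pt").to("cuda", torch.float16)
output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(processor.decode(output[0], skip_special_tokens=True))

# Batch generate: pass lists and pad on the left for generation.
processor.tokenizer.padding_side = "left"
batch = processor(images=[image, image], text=[prompt, prompt], padding=True, return_tensors="pt").to("cuda", torch.float16)
outputs = model.generate(**batch, max_new_tokens=64, do_sample=False)
print(processor.batch_decode(outputs, skip_special_tokens=True))
```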
However, in my case it failed the assertions (i.e., torch.allclose), and because of that the generated texts have slight differences compared to the expected ones.
Did you encounter this? Any ideas on how to mitigate it? @Luodian can also hop in with any suggestions.
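In case it helps narrow this down, I have been comparing the two models along these lines; logits_original and logits_converted are placeholders for the single forward-pass outputs of the original checkpoint and the converted HF model on identical inputs:

```python
import torch

def report_logit_diff(logits_original: torch.Tensor, logits_converted: torch.Tensor, atol: float = 1e-3) -> None:
    """Report how far apart the original and converted models' logits are
    when both are run on identical inputs."""
    diff = (logits_original.float() - logits_converted.float()).abs()
    print(f"max abs diff:  {diff.max().item():.6f}")
    print(f"mean abs diff: {diff.mean().item():.6f}")
    # The strict defaults of torch.allclose (rtol=1e-5, atol=1e-8) often fail under
    # fp16/bf16; a looser atol helps tell numerical noise apart from a real conversion bug.
    print("allclose:", torch.allclose(logits_original.float(), logits_converted.float(), atol=atol))
```

Even a small logit gap can flip a greedy decoding step when two candidate tokens have nearly equal scores, which would explain generated text that differs only slightly from the expected output.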
Thanks for sharing. However, I cannot even pass the check at https://github.com/zucchini-nlp/transformers/blob/llava-onevision/src/transformers/models/llava_onevision/convert_llava_onevision_weights_to_hf.py#L216.
The average diff per pixel is about 0.002.
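That number comes from a comparison along these lines; pixel_values_orig and pixel_values_hf are placeholders for the tensors produced by the original preprocessing and by the converted HF image processor on the same image:

```python
import torch

def report_pixel_diff(pixel_values_orig: torch.Tensor, pixel_values_hf: torch.Tensor) -> None:
    """Report how far apart the pixel_values from the original preprocessing and
    from the converted HF image processor are for the same input image."""
    diff = (pixel_values_orig.float() - pixel_values_hf.float()).abs()
    print(f"mean abs diff per pixel: {diff.mean().item():.6f}")
    print(f"max abs diff:            {diff.max().item():.6f}")
    # Differences at this scale usually come from the preprocessing itself
    # (resize/interpolation backend, normalization constants, anyres padding)
    # rather than from the converted weights.
```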