Raushan Turganbay

117 comments by Raushan Turganbay

@Luodian Hey again! Just wanted to check in and see if you had any updates on this. Thanks!

@Luodian I see, thanks. Implementing the current state of "llava-next-video" sounds good to me. The model shows very good performance on videos, and we can give it more visibility on...

The model is added to Transformers and will be part of the next 4.42 release! Please find all checkpoints here: https://huggingface.co/collections/llava-hf/llava-next-video-6666a9173a64c7052930f153 🤗
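
For anyone landing here later, a minimal single-video inference sketch (the checkpoint name, prompt wording, and frame-sampling helper are assumptions on my side; see the collection above for all available variants):

```python
import av
import numpy as np
import torch
from transformers import LlavaNextVideoProcessor, LlavaNextVideoForConditionalGeneration

# One of the checkpoints from the collection (assumed here for illustration)
model_id = "llava-hf/LLaVA-NeXT-Video-7B-hf"
processor = LlavaNextVideoProcessor.from_pretrained(model_id)
model = LlavaNextVideoForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

def read_video_frames(path, num_frames=8):
    # Uniformly sample `num_frames` frames from the video with PyAV.
    # Note: stream.frames can be 0 for some containers; this sketch assumes it is set.
    container = av.open(path)
    stream = container.streams.video[0]
    total = stream.frames
    indices = set(np.linspace(0, total - 1, num_frames).astype(int).tolist())
    frames = [
        f.to_ndarray(format="rgb24")
        for i, f in enumerate(container.decode(stream))
        if i in indices
    ]
    return np.stack(frames)

clip = read_video_frames("my_video.mp4")  # hypothetical local file
prompt = "USER: <video>\nWhat is happening in this video? ASSISTANT:"
inputs = processor(text=prompt, videos=clip, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
```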

@HarperGG please take a look at the colab notebook for inference; there are code snippets for batch inference near the end. @zhengrongz yes, the config was missing the rope scaling factor....
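
For reference, batched inference roughly follows this pattern (a sketch reusing the `processor` and `model` from the snippet above; `clips` is assumed to be a list of frame arrays, one per prompt, and `padding=True` aligns the different prompt lengths):

```python
prompts = [
    "USER: <video>\nDescribe this video. ASSISTANT:",
    "USER: <video>\nWhat is the main object here? ASSISTANT:",
]
# clips: list of np.ndarray frame stacks, one per prompt (assumed already loaded)
inputs = processor(text=prompts, videos=clips, padding=True, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(processor.batch_decode(output, skip_special_tokens=True))
```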

@zhengrongz you can strip off the prompt text based on input length, similar to below:

```python
inputs = processor(prompt, videos=clip, return_tensors="pt")
input_length = inputs.input_ids.shape[-1]
output = model.generate(**inputs)
output_stripped = output[:, input_length:]
```
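
From there, decoding the stripped ids gives just the generated answer (a small follow-up sketch):

```python
answer = processor.batch_decode(output_stripped, skip_special_tokens=True)[0]
print(answer)
```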

@Namzakku hey! yes, it should work with any inputs actually. Can you show the error you encountered and the minimal code to reproduce the error?

Ah I see, forgot that llava-next processes images in patches, and each image can contain a different number of patches, unlike videos where the number of frames is fixed. A solution can...

@ameeramer right, in the last release we made some changes in the backbone LLM which caused errors in LLaVA-NeXT-Video. Made [a PR](https://github.com/huggingface/transformers/pull/32527) to fix it and will probably do a patch...

@Namzakku Hmm, from the device map there doesn't seem to be anything that would cause device mismatch errors. Can you share the full traceback so we can see exactly where the tensors end up on different...

@Namzakku Yes, it supports multi-turn conversations just the way you have it in the example. You just need to pass in the convo to `processor.apply_chat_template()` and you'll get a correct...
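
To illustrate, a multi-turn conversation would look roughly like this (the role/content structure follows the llava-hf chat templates; the conversation text itself is made up, and `processor`, `model`, and `clip` come from the earlier snippets):

```python
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "video"},
            {"type": "text", "text": "What is happening in this video?"},
        ],
    },
    {
        "role": "assistant",
        "content": [{"type": "text", "text": "A dog is chasing a ball in a park."}],
    },
    {
        "role": "user",
        "content": [{"type": "text", "text": "What breed is the dog?"}],
    },
]
# Build the full prompt from the conversation, then run generation as usual
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(text=prompt, videos=clip, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=60)
```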