LLaVA-NeXT

LLaMA3-8B video inference

Open airogachev opened this issue 1 year ago • 0 comments

Since llama3-llava-next-8b and LLaVA-NeXT-Video-7B-DPO seem to have the same interface, is it possible to make llama3-llava-next-8b process multiple frames of one video in a single forward pass?

Basically, I don't understand what makes LLaVA-NeXT-Video able to process multiple frames, and I can't find the related code. According to the blog post, the difference is that one model processes patches of a single image while the other processes frames of a video, which is what prompted the question above.
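To make the patches-versus-frames distinction concrete, here is a minimal sketch of the visual token counts in each case. It is not taken from the LLaVA-NeXT code: the function names are hypothetical, and the defaults (a 336 px vision encoder input, 14 px ViT patches, a 2x2 crop grid plus one base view for images, and stride-2 spatial pooling for video frames) are assumptions based on my reading of the blog post. Extra tokens such as row separators are ignored.

```python
# Hypothetical helpers for counting visual tokens; not part of LLaVA-NeXT.

def tokens_per_image(image_size: int = 336, patch_size: int = 14) -> int:
    """Tokens produced by the vision encoder for one square input image."""
    side = image_size // patch_size  # assumed 336 // 14 = 24 patches per side
    return side * side


def anyres_tokens(grid_rows: int, grid_cols: int,
                  image_size: int = 336, patch_size: int = 14) -> int:
    """Image case: one downscaled base view plus a grid of crops of ONE image."""
    n_views = 1 + grid_rows * grid_cols
    return n_views * tokens_per_image(image_size, patch_size)


def video_tokens(num_frames: int, pool_stride: int = 2,
                 image_size: int = 336, patch_size: int = 14) -> int:
    """Video case: every frame is its own view, spatially pooled (assumed stride)."""
    side = (image_size // patch_size) // pool_stride
    return num_frames * side * side


if __name__ == "__main__":
    print("single image, 2x2 grid:", anyres_tokens(2, 2))
    print("16 frames, stride-2 pooling:", video_tokens(16))
```

Under these assumptions both cases reduce to a batch of views passed through the same vision encoder; only the source of the views (crops of one image versus frames of one video) and the pooling differ.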

airogachev · May 15 '24 07:05