LLaVA-NeXT LLaMA3-8B video inference

Open airogachev opened this issue 1 year ago • 0 comments

Since llama3-llava-next-8b and LLaVA-NeXT-Video-7B-DPO seem to share the same interface, is it possible to have llama3-llava-next-8b process multiple frames of one video in a single forward pass?

More generally, I don't understand what lets LLaVA-NeXT-Video handle multiple frames, and I can't find the related code. According to the blog post, the difference is that one model processes patches of a single image while the other processes frames of a video, which is what prompted the question above.
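
To make the question concrete, here is a minimal sketch of how I read the patches-versus-frames distinction. It is not taken from the LLaVA-NeXT code: the helper names `grid_patch_boxes` and `uniform_frame_indices` are hypothetical, and the 336-pixel tile size and 4-frame sample count are illustrative assumptions only. The point is just that both paths end up with a list of N same-sized inputs, which is why the two models look interchangeable from the outside.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def grid_patch_boxes(width: int, height: int, tile: int) -> List[Box]:
    """Hypothetical helper: split ONE image into tile x tile crops, row-major."""
    boxes: List[Box] = []
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            boxes.append((left, top, left + tile, top + tile))
    return boxes


def uniform_frame_indices(num_frames: int, num_samples: int) -> List[int]:
    """Hypothetical helper: pick frame indices spread evenly over ONE video."""
    if num_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= num_frames:
        return list(range(num_frames))
    step = num_frames / num_samples
    # Take the middle frame of each of the num_samples equal segments.
    return [int(step * i + step / 2) for i in range(num_samples)]


# Image path: N spatial crops of a single picture (tile size is an assumption).
patches = grid_patch_boxes(width=672, height=672, tile=336)

# Video path: N whole frames taken at different times (count is an assumption).
frames = uniform_frame_indices(num_frames=100, num_samples=4)

# Both are "a list of N visual inputs" as far as the interface is concerned.
print(len(patches), len(frames))
```

If this reading is right, what I am asking is whether llama3-llava-next-8b can be given the second kind of list in place of the first.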

May 15 '24 07:05 airogachev
