Results 315 LLaVA-NeXT issues

As llama3-llava-next-8b and LLaVA-NeXT-Video-7B-DPO seem to have the same interface, is it possible to make llama3-llava-next-8b process multiple frames of one video per single forward? Basically, I don't get the...

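Is there any description of the 790K training data used in the second stage? Which datasets does it include? Or are there plans to release this data?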

The [announcement blog post](https://llava-vl.github.io/blog/2024-04-30-llava-next-video/) indicates inference can be done with sglang, but attempting to load the 7B model with the sglang backend: `python -m sglang.launch_server --model-path ~/models/lmms-lab_LLaVA-NeXT-Video-7B-DPO --port 30000`...

To my knowledge, the videos in the NExTQA dataset are relatively short, with an average length of 44 seconds, and there is a noted static bias[1] in the ActivityNet QA...

The [checkpoint link](https://huggingface.co/collections/lmms-lab/llava-next-6623288e2d61edba3ddbf5ff) for LLaVA-NeXT (Stronger) seems to be broken

Thanks for your work on this! When will fine-tuning scripts be made available?

Hello. Thank you for your great work. I ran into the following error: File "/home3/user/mllm/LLaVA_NeXT/llava/mm_utils.py", line 377, in `__call__`: `if output_ids[0, -keyword_id.shape[0] :] == keyword_id:` RuntimeError: Boolean value of Tensor with...

Hi, I was just testing whether I could reproduce the results from your demo in my own code that imports the library. I was attempting to prompt two images and then...

Thank you for the great work! I am curious about the computing resources used for LLaVA-NeXT.