LLaVA-NeXT
Hi, thanks for your interest in this project! Since this repository is maintained by multiple paper authors with limited bandwidth, we may not be able to review or respond to...
Hi, is the prompt version in the pre-training stage of OneVision (see https://github.com/LLaVA-VL/LLaVA-NeXT/blob/main/scripts/train/pretrain_clip.sh) set to `plain` on purpose? Should it not be `qwen_2`? If it is done on purpose, could...
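In case it helps anyone comparing the two, here is a minimal sketch that prints what each prompt version renders to; it assumes the repo's `llava.conversation` module and that both `plain` and `qwen_2` are registered keys in `conv_templates`:

```python
# Sketch: print what each prompt version actually renders to.
# Assumes llava.conversation.conv_templates contains both keys.
from llava.conversation import conv_templates

for version in ("plain", "qwen_2"):
    conv = conv_templates[version].copy()
    conv.append_message(conv.roles[0], "<image>\nDescribe the image.")
    conv.append_message(conv.roles[1], None)
    print(f"--- {version} ---")
    print(repr(conv.get_prompt()))
```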
When I use LLaVA-Video for inference, loading the model gives this error:
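Without the full traceback it is hard to say more, but for comparison, here is a minimal loading sketch assuming the repo's `load_pretrained_model` builder; the checkpoint and `model_name` below are examples, not necessarily the poster's exact setup:

```python
# Minimal loading sketch for a LLaVA-Video checkpoint, assuming the
# repo's builder API in llava/model/builder.py.
from llava.model.builder import load_pretrained_model

model_path = "lmms-lab/LLaVA-Video-7B-Qwen2"  # example checkpoint
tokenizer, model, image_processor, max_length = load_pretrained_model(
    model_path,
    None,           # model_base: None when loading full (non-LoRA) weights
    "llava_qwen",   # model_name selects which architecture is built
    device_map="auto",
)
model.eval()
```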
Model: llava-onevision-qwen2-0.5B-ov. The grounding results show a very noticeable offset, and the training set seems to contain only a very small proportion of grounding data?
Thanks for your work! I've downloaded the LLaVA-Video-178K dataset and I want to pick several **specific types of questions** for my research, according to Fig. 3 in your paper. It seems...
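A minimal filtering sketch, assuming the annotations are a JSON list of entries; `question_type` is a hypothetical field name here, so check the actual keys in the release you downloaded:

```python
# Sketch: filter LLaVA-Video-178K annotations down to chosen question types.
# "question_type" and the file paths are hypothetical -- verify against
# the actual schema of the downloaded annotation files.
import json

WANTED_TYPES = {"temporal", "causal"}  # example categories, adjust per Fig. 3

with open("annotations.json") as f:    # hypothetical local path
    entries = json.load(f)

subset = [e for e in entries if e.get("question_type") in WANTED_TYPES]
print(f"kept {len(subset)} of {len(entries)} entries")

with open("annotations_subset.json", "w") as f:
    json.dump(subset, f, indent=2)
```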
When passing both single-image and multi-image inputs in the same batch, the following error occurs:

```
RuntimeError: Tensors must have same number of dimensions: got 2 and 1
```

Is...
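One workaround (a sketch, not the repo's internal fix) is to normalize every sample's image tensor to the same rank before collating, so single-image samples (C, H, W) and multi-image samples (N, C, H, W) batch cleanly:

```python
# Sketch: make every sample's image tensor 4-D (num_images, C, H, W)
# before collating, so single- and multi-image samples share the same rank.
import torch

def normalize_images(images: list[torch.Tensor]) -> list[torch.Tensor]:
    out = []
    for img in images:
        if img.dim() == 3:          # single image: (C, H, W)
            img = img.unsqueeze(0)  # -> (1, C, H, W)
        out.append(img)
    return out
```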
@Luodian

### TL;DR

**I'm trying to fine-tune LLaVA-NeXT OneVision, and when I use the local weights of `google/siglip-so400m-patch14-384`, I get the following shape mismatch error:**

> `RuntimeError: size mismatch for...`
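A size mismatch like this usually means the local copy has a different image size, patch size, or hidden size than the training config expects. A quick sanity check, assuming a standard Hugging Face checkpoint layout in a hypothetical local directory:

```python
# Sketch: verify that local SigLIP weights match the expected
# so400m-patch14-384 geometry before plugging them into training.
from transformers import SiglipVisionModel

LOCAL_DIR = "/path/to/siglip-so400m-patch14-384"  # hypothetical local path

model = SiglipVisionModel.from_pretrained(LOCAL_DIR)
cfg = model.config
print("image_size:", cfg.image_size)    # expected 384
print("patch_size:", cfg.patch_size)    # expected 14
print("hidden_size:", cfg.hidden_size)  # expected 1152 for so400m

pos = model.vision_model.embeddings.position_embedding.weight
print("position_embedding:", tuple(pos.shape))  # expected (729, 1152): (384/14)^2 patches
```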
Hello, I am trying to run the lmms-lab/LLaVA-NeXT-Video-32B-Qwen model on an A100-40GB GPU. However, I encounter an OOM issue when loading the model in its default configuration. To address this,...
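Two common mitigations are 4-bit quantization and sharding across devices. A sketch, assuming the builder exposes `load_4bit` and `device_map` as in the original LLaVA loader (verify against `llava/model/builder.py`):

```python
# Sketch: two common ways to fit a 32B checkpoint on a 40 GB card.
# Assumes the repo's builder accepts load_4bit / device_map like the
# original LLaVA loader.
from llava.model.builder import load_pretrained_model

tokenizer, model, image_processor, max_length = load_pretrained_model(
    "lmms-lab/LLaVA-NeXT-Video-32B-Qwen",
    None,
    "llava_qwen",
    load_4bit=True,      # quantize weights to 4-bit via bitsandbytes
    device_map="auto",   # or shard across GPUs/CPU offload if available
)
```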
I tried to run the `video_demo.sh` script on my own video. I only modified the video path without changing any other parameters:

```
bash scripts/video/demo/video_demo.sh lmms-lab/LLaVA-NeXT-Video-7B-DPO vicuna_v1 32 2...
```
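For anyone inspecting what the script does with its positional arguments, here is a sketch of uniform frame sampling with decord, assuming the `32` above is the number of frames drawn from the video (check the script's argument list to confirm):

```python
# Sketch of uniform frame sampling, assuming "32" in the demo invocation
# is the frame count. Uses decord, which LLaVA-NeXT's video pipeline
# typically relies on.
import numpy as np
from decord import VideoReader, cpu

def sample_frames(video_path: str, num_frames: int = 32) -> np.ndarray:
    vr = VideoReader(video_path, ctx=cpu(0))
    idx = np.linspace(0, len(vr) - 1, num_frames).astype(int)
    return vr.get_batch(idx).asnumpy()  # (num_frames, H, W, 3), uint8

frames = sample_frames("my_video.mp4")  # hypothetical path
print(frames.shape)
```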