LLaVA-NeXT issues

Request for LLaVA-OneVision Model & Data Specifications

Hello, we'd like to begin by expressing our sincere appreciation for your team's excellent work on LLaVA-OneVision and for making this powerful model publicly available. It is a fantastic contribution...

jungsunIm

non-meta parameter vs meta parameter

for vision_model.post_layernorm.weight: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass `assign=True` to assign...

ilikeyouyoulikeme

Training fails with tons of missing imports in llava_trainer.py

I'm trying to train a model, but the `llava/train/llava_trainer.py` file has broken imports everywhere. I followed the installation steps in the Readme.md: > conda create -n llava python=3.10 -y >...

peterwisu

Fix missing imports and code formatting of `llava/train/llava_trainer.py`

### Summary Resolve import errors encountered when running `scripts/train/pretrain_clip.sh` by adding the required imports and applying minor formatting updates in `llava/train/llava_trainer.py`. No functional changes intended beyond unblocking the training run....

naufalso

Setting different learning rate for different modules

The code provides two arguments, `mm_vision_tower_lr` and `mm_projector_lr`, for setting per-module learning rates externally. However, neither takes effect. I suspect the reason is that the optimizer...

yufeixue-ai
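
For readers hitting the learning-rate issue above, here is a minimal sketch of how per-module learning rates are usually expressed as optimizer parameter groups. It is not the repository's trainer code. The helper `build_param_groups` and the name fragments `"mm_projector"` and `"vision_tower"` used for matching are assumptions made for illustration; only the argument names `mm_projector_lr` and `mm_vision_tower_lr` come from the issue.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def build_param_groups(
    named_params: Iterable[Tuple[str, object]],
    base_lr: float,
    mm_projector_lr: Optional[float] = None,
    mm_vision_tower_lr: Optional[float] = None,
    projector_key: str = "mm_projector",     # assumed fragment of projector parameter names
    vision_tower_key: str = "vision_tower",  # assumed fragment of vision tower parameter names
) -> List[Dict]:
    """Split (name, parameter) pairs into groups, each with its own learning rate.

    A module-specific rate left as None falls back to base_lr.
    """
    projector = {
        "params": [],
        "lr": base_lr if mm_projector_lr is None else mm_projector_lr,
    }
    vision_tower = {
        "params": [],
        "lr": base_lr if mm_vision_tower_lr is None else mm_vision_tower_lr,
    }
    rest = {"params": [], "lr": base_lr}

    for name, param in named_params:
        if projector_key in name:
            projector["params"].append(param)
        elif vision_tower_key in name:
            vision_tower["params"].append(param)
        else:
            rest["params"].append(param)

    # Drop empty groups so the optimizer never receives a group without parameters.
    return [group for group in (projector, vision_tower, rest) if group["params"]]
```

The returned list has the shape that PyTorch optimizers accept as their first argument, a list of dicts with `params` and `lr` keys, for example `torch.optim.AdamW(build_param_groups(model.named_parameters(), base_lr=1e-5, mm_projector_lr=1e-3))`. If a custom rate has no effect, one thing worth checking is whether the optimizer was built from grouped parameters like these or from a single flat parameter list.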

How to process the ReCap data in parquet format on Hugging Face for the OneVision Mid-Stage?

yxchng

Eager Attention Works with OneVision 0.5B but Not with 7B

I am trying to extract attention weights from the model and thus need to use the `eager` implementation. The following code works: ```python # pip install git+https://github.com/LLaVA-VL/LLaVA-NeXT.git from llava.model.builder import load_pretrained_model...

varungupta31