LLaVA-NeXT-Interleave Training Details

Open friedrichor opened this issue 1 year ago • 3 comments

Hello. Thanks for your excellent work!

Earlier, I reproduced LLaVA-NeXT-Image training and got the desired performance, and I am now trying to reproduce LLaVA-NeXT-Interleave training. I would like to inquire about the details of LLaVA-NeXT-Interleave's training.

What values of image_aspect_ratio and mm_patch_merge_type were used for training? I notice that the config.json in lmms-lab/llava-next-interleave-qwen-7b has the following settings:

"image_aspect_ratio": "pad",
"mm_patch_merge_type": "flat",

Were the same settings used for training? Since the training data includes some single-image samples, I'm not sure whether those samples should go through AnyRes.
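
For context, the two released values can be inspected with a few lines of Python. This is only a minimal sketch: `read_mm_settings` is a hypothetical helper, and the JSON string is an excerpt containing just the two fields quoted above, not the full config.json. It shows what the released checkpoint reports, not what was used during training.

```python
import json


def read_mm_settings(config_text: str) -> dict:
    """Return the two multimodal settings in question (None if a key is absent)."""
    config = json.loads(config_text)
    keys = ("image_aspect_ratio", "mm_patch_merge_type")
    return {key: config.get(key) for key in keys}


# Excerpt of the released config.json quoted above; the real file has more fields.
sample = '{"image_aspect_ratio": "pad", "mm_patch_merge_type": "flat"}'
print(read_mm_settings(sample))
```

For a downloaded checkpoint, the same helper can be applied to the text of its config.json file.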

friedrichor · Jul 17 '24 06:07