LLaVA-NeXT
fix prepare_inputs_labels_for_multimodal in llava_arch
If `image_idx` is not in `video_idx_in_batch`, `image_feature` is appended to `new_image_features` repeatedly, which should be avoided. This PR fixes that.
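To illustrate the kind of duplication described above, here is a minimal sketch using plain Python lists instead of tensors. It is not the actual code from `llava_arch`: the function names `collect_features_buggy` and `collect_features_fixed`, the `("video", ...)`/`("image", ...)` tagging, and the exact placement of the extra append are assumptions made for illustration only.

```python
def collect_features_buggy(image_features, video_idx_in_batch):
    """Hypothetical shape of the bug: non-video entries get appended twice."""
    new_image_features = []
    for image_idx, image_feature in enumerate(image_features):
        if image_idx in video_idx_in_batch:
            # Video path: appended once.
            new_image_features.append(("video", image_feature))
        else:
            # Image path: appended here...
            new_image_features.append(("image", image_feature))
            # ...and again here (assumed location of the duplicate append).
            new_image_features.append(("image", image_feature))
    return new_image_features


def collect_features_fixed(image_features, video_idx_in_batch):
    """Each feature is appended exactly once, whichever path it takes."""
    new_image_features = []
    for image_idx, image_feature in enumerate(image_features):
        if image_idx in video_idx_in_batch:
            new_image_features.append(("video", image_feature))
        else:
            new_image_features.append(("image", image_feature))
    return new_image_features


features = ["f0", "f1", "f2"]
video_idx_in_batch = [1]  # only the second entry is a video

buggy = collect_features_buggy(features, video_idx_in_batch)
fixed = collect_features_fixed(features, video_idx_in_batch)
print(len(buggy), len(fixed))
```

With the fix, `new_image_features` has the same length as `image_features`, so any downstream code that indexes features by batch position stays aligned.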