LLaVa-NeXT/Fine_tune_LLaVaNeXT_on_a_custom_dataset_(with_PyTorch_Lightning).ipynb fails at training
ValueError Traceback (most recent call last)
24 frames
/usr/local/lib/python3.10/dist-packages/transformers/models/llava_next/modeling_llava_next.py in _merge_input_ids_with_image_features(self, image_features, feature_lens, inputs_embeds, input_ids, attention_mask, position_ids, labels, image_token_index, ignore_index)
    541         total_num_special_image_tokens = torch.sum(special_image_token_mask)
    542         if total_num_special_image_tokens != num_images:
--> 543             raise ValueError(
    544                 f"Number of image tokens in input_ids ({total_num_special_image_tokens}) different from num_images ({num_images})."
    545             )
ValueError: Number of image tokens in input_ids (0) different from num_images (1).
This error appears only after fixing another error concerning the chat_template.
In the collate functions:
chat_template = (
"{% if messages[0]['role'] == 'instruction' %}"
"Instruction: {{- messages[0]['content'] }}\n"
"{% set messages = messages[1:] %}"
"{% endif %}"
"{% for message in messages %}"
"Question:"
"{% for line in message['query'] %}"
"{% if line['type'] == 'text' %}"
"{{- line['text'] }}"
"{% elif line['type'] == 'image' %}"
"{{ '
text_prompt = processor.tokenizer.apply_chat_template(conversation, chat_template=chat_template, add_generation_prompt=True)
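The ValueError above says the tokenized input contained 0 image tokens while 1 image was passed, so one thing worth checking is whether the rendered prompt contains an image placeholder at all before it reaches the model. The snippet below is only a minimal sketch of such a check, not part of the notebook: the helper name `check_image_placeholders` is hypothetical, it assumes the prompt is a plain string, and it assumes the placeholder is the literal string `<image>` (verify this against your processor and tokenizer).

```python
# Hypothetical sanity check, not taken from the notebook.
# Assumption: the image placeholder is the literal string "<image>".
IMAGE_TOKEN = "<image>"


def check_image_placeholders(prompt: str, num_images: int,
                             image_token: str = IMAGE_TOKEN) -> int:
    """Return the number of placeholders in `prompt`; raise if it differs from `num_images`."""
    found = prompt.count(image_token)
    if found != num_images:
        raise ValueError(
            f"Prompt contains {found} '{image_token}' placeholder(s) "
            f"but {num_images} image(s) were provided: {prompt!r}"
        )
    return found


# A prompt with one placeholder and one image passes the check.
check_image_placeholders("Question: <image>\nWhat is shown?", num_images=1)
```

Calling this on each rendered prompt inside the collate function would surface a missing placeholder before the forward pass, which may make it easier to tell whether the template or the conversation structure is at fault.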
https://github.com/huggingface/transformers/issues/32303
Hey @7AtAri, did you figure out the cause of the error?