add llama3's prompt template to conversation.py
#1426
@KazutoshiShinoda can you add the preprocess_llama_3 function code? I will test it at the data-preparation stage.
What about preprocessing inside LazySupervisedDataset?
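In the meantime, here is a minimal sketch of what a Llama-3 entry in conversation.py could look like, assembled from Meta's published chat-format tokens. The conv_llava_llama_3 name and the SeparatorStyle.LLAMA_3 enum member are hypothetical, not from the LLaVA codebase; a matching branch would also be needed in Conversation.get_prompt().

# Hypothetical sketch only: a Llama-3 template built from Meta's
# documented special tokens. SeparatorStyle.LLAMA_3 would have to be
# added to the enum and handled in Conversation.get_prompt().
conv_llava_llama_3 = Conversation(
    system="<|start_header_id|>system<|end_header_id|>\n\n"
           "You are a helpful language and vision assistant.",
    roles=("<|start_header_id|>user<|end_header_id|>\n\n",
           "<|start_header_id|>assistant<|end_header_id|>\n\n"),
    messages=(),
    offset=0,
    sep_style=SeparatorStyle.LLAMA_3,
    sep="<|eot_id|>",
)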
Hi @KazutoshiShinoda, @awzhgw, @Jayantverma2,
I hope you are doing well. We have just released our project LLaVA++: Extending Visual Capabilities with LLaMA-3 and Phi-3, which features LLaMA-3 and Phi-3-Mini based LLaVA models. Please take a look at LLaVA++.
- We have released the code required to support both the LLaMA-3 and Phi-3-Mini models in the LLaVA framework. The chat formats and corresponding preprocess methods are available in our GitHub repo.
- We have released all the checkpoints on Hugging Face.
- On our GitHub repository, we have provided the .py files that need to be replaced/added in the official LLaVA repository to train and run inference with LLaMA-3 and Phi-3-Mini based models.
I hope this is helpful. Please let me know if you have any questions. Thanks!
@mmaaz60 In your implementation, I can see the following preprocessing logic, but I don't quite understand why round_len is decremented by 1 when i > 0. Could you explain that a little?
for conversation, target in zip(conversations, targets):
    total_len = int(target.ne(tokenizer.pad_token_id).sum())

    rounds = conversation.split(conv.sep)
    re_rounds = [conv.sep.join(rounds[:3])]
    for conv_idx in range(3, len(rounds), 2):
        re_rounds.append(conv.sep.join(rounds[conv_idx:conv_idx + 2]))

    cur_len = 0
    target[:cur_len] = IGNORE_INDEX
    for i, rou in enumerate(re_rounds):
        if rou == "":
            break

        parts = rou.split(sep)
        if len(parts) != 2:
            break
        parts[0] += sep

        if has_image:
            round_len = len(tokenizer_image_token(rou, tokenizer)) + 1
            instruction_len = len(tokenizer_image_token(parts[0], tokenizer))
        else:
            round_len = len(tokenizer(rou).input_ids) + 1
            instruction_len = len(tokenizer(parts[0]).input_ids)

        if i > 0:
            round_len -= 1
            instruction_len -= 1

        target[cur_len: cur_len + instruction_len] = IGNORE_INDEX
        cur_len += round_len
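Not speaking for @mmaaz60, but one plausible explanation (an assumption, not confirmed in this thread): the Hugging Face Llama-3 tokenizer prepends the BOS token <|begin_of_text|> to every encode call. For the first round that BOS genuinely sits at the start of the conversation, so the count is correct; every later round is tokenized in isolation, picks up a spurious BOS, and would be over-counted by one token, so both round_len and instruction_len are reduced by 1 to keep cur_len aligned with the target tensor. A quick way to check the tokenizer behavior (model id shown for illustration):

from transformers import AutoTokenizer

# The Llama-3 tokenizer prepends <|begin_of_text|> (BOS) to every call,
# so per-round token counts over-count by one after the first round.
tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
ids = tok("any round of conversation text").input_ids
assert ids[0] == tok.bos_token_id  # spurious BOS on isolated rounds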