add llama3's prompt template to conversation.py
#1426
@KazutoshiShinoda can you add the preprocess_llama_3 function code? I will test it at the data-preparation stage.
What about preprocessing inside LazySupervisedDataset?
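In the meantime, here is a minimal sketch of what a Llama-3 entry in conversation.py could look like, assembled from Meta's published chat-format tokens. The conv_llava_llama_3 name and the SeparatorStyle.LLAMA_3 enum member are hypothetical, not from the LLaVA codebase; a matching branch would also be needed in Conversation.get_prompt().

# Hypothetical sketch only: a Llama-3 template built from Meta's
# documented special tokens. SeparatorStyle.LLAMA_3 would have to be
# added to the enum and handled in Conversation.get_prompt().
conv_llava_llama_3 = Conversation(
    system="<|start_header_id|>system<|end_header_id|>\n\n"
           "You are a helpful language and vision assistant.",
    roles=("<|start_header_id|>user<|end_header_id|>\n\n",
           "<|start_header_id|>assistant<|end_header_id|>\n\n"),
    messages=(),
    offset=0,
    sep_style=SeparatorStyle.LLAMA_3,
    sep="<|eot_id|>",
)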
Hi @KazutoshiShinoda, @awzhgw, @Jayantverma2,
I hope you are doing well. We have just released our project LLaVA++: Extending Visual Capabilities with LLaMA-3 and Phi-3, which features LLaMA-3 and Phi-3-Mini based LLaVA models. Please take a look at LLaVA++.
- We have released the code required to support both the LLaMA-3 and Phi-3-Mini models in the LLaVA framework. The chat formats and corresponding preprocess methods are available in our GitHub repo.
- We have released all the checkpoints on Hugging Face.
- On our GitHub repository, we have provided the .py files that need to be replaced/added in the official LLaVA repository to train and run inference with LLaMA-3 and Phi-3-Mini based models.
I hope this is helpful. Please let me know if you have any questions. Thanks!
@mmaaz60 In your implementation, I can see the following preprocessing logic, but I don't quite understand why round_len is decremented by 1 when i > 0. Could you explain that a little?
for conversation, target in zip(conversations, targets):
    total_len = int(target.ne(tokenizer.pad_token_id).sum())

    rounds = conversation.split(conv.sep)
    re_rounds = [conv.sep.join(rounds[:3])]
    for conv_idx in range(3, len(rounds), 2):
        re_rounds.append(conv.sep.join(rounds[conv_idx:conv_idx + 2]))

    cur_len = 0
    target[:cur_len] = IGNORE_INDEX
    for i, rou in enumerate(re_rounds):
        if rou == "":
            break

        parts = rou.split(sep)
        if len(parts) != 2:
            break
        parts[0] += sep

        if has_image:
            round_len = len(tokenizer_image_token(rou, tokenizer)) + 1
            instruction_len = len(tokenizer_image_token(parts[0], tokenizer))
        else:
            round_len = len(tokenizer(rou).input_ids) + 1
            instruction_len = len(tokenizer(parts[0]).input_ids)

        if i > 0:
            round_len -= 1
            instruction_len -= 1

        target[cur_len: cur_len + instruction_len] = IGNORE_INDEX
        cur_len += round_len
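Not speaking for @mmaaz60, but one plausible explanation (an assumption, not confirmed in this thread): the Hugging Face Llama-3 tokenizer prepends the BOS token <|begin_of_text|> to every encode call. For the first round that BOS genuinely sits at the start of the conversation, so the count is correct; every later round is tokenized in isolation, picks up a spurious BOS, and would be over-counted by one token, so both round_len and instruction_len are reduced by 1 to keep cur_len aligned with the target tensor. A quick way to check the tokenizer behavior (model id shown for illustration):

from transformers import AutoTokenizer

# The Llama-3 tokenizer prepends <|begin_of_text|> (BOS) to every call,
# so per-round token counts over-count by one after the first round.
tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
ids = tok("any round of conversation text").input_ids
assert ids[0] == tok.bos_token_id  # spurious BOS on isolated rounds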