daegon Yu
Additionally, I have a question. When training a decoder model, I understand that when the instruction part is input to the model, only the response part is learned by calculating...
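For anyone following along, here is a minimal sketch of what "only learning the response part" usually means in practice: the labels for the instruction tokens are set to -100, the value that PyTorch's cross-entropy loss ignores by default, so only response tokens contribute to the loss. The token ids, the template ids, and both helper functions below are hypothetical and only for illustration; this is not the actual code of any particular collator.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def find_response_start(input_ids, response_template_ids):
    """Return the index of the first token after the response template.

    If the template is not found, return len(input_ids), so that every
    token ends up masked (hypothetical fallback chosen for this sketch).
    """
    n = len(response_template_ids)
    for i in range(len(input_ids) - n + 1):
        if input_ids[i:i + n] == response_template_ids:
            return i + n
    return len(input_ids)


def mask_instruction_labels(input_ids, response_start):
    """Copy input_ids into labels and ignore everything before response_start."""
    labels = list(input_ids)
    for i in range(min(response_start, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels


# Hypothetical token ids, not produced by a real tokenizer.
instruction_ids = [11, 12, 13]
template_ids = [90, 91]      # stands in for the tokenized response template
response_ids = [21, 22, 2]   # response tokens followed by an end-of-sequence id

input_ids = instruction_ids + template_ids + response_ids
start = find_response_start(input_ids, template_ids)
labels = mask_instruction_labels(input_ids, start)
print(labels)  # [-100, -100, -100, -100, -100, 21, 22, 2]
```

The model still sees the full sequence as input; the mask only changes which positions are scored when the loss is computed.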
Oh, this is what I was looking for. Thank you!
One thing I'm wondering about while researching this: is it okay to assume that using DataCollatorForCompletionOnlyLM(response_template, tokenizer=tokenizer) and using DataCollatorForSeq2Seq(tokenizer = tokenizer) with train_on_responses_only( trainer, #instruction_part = "user\n\n", response_part...
Has this issue been resolved? I am also training a LoRA with the unsloth/gemma2-9b-it model, and when I resume training, the logged reward does not continue from where the checkpoint left off.