Use DataCollatorForCompletionOnlyLM to train the LLM to follow instructions
In Chapter 11 of the course, in the file "FineTuning with SFTTrainer" (3.mdx), you explain how to fine-tune a DeepSeek model with SFTTrainer on an instruction dataset.
Why don't you use the DataCollatorForCompletionOnlyLM data collator with the SFTTrainer, so that gradients are not computed and back-propagated on the user-question tokens? The default data collator in SFTTrainer is DataCollatorForLanguageModeling, which means that in your example the LLM will also learn to complete the user query, which, as far as I understand, is not the intent. I have sketched below what I had in mind.
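
For context, here is a minimal sketch adapted from the "train on completions only" example in the TRL documentation, assuming a TRL version where DataCollatorForCompletionOnlyLM is still available. The model name, dataset, column names (`instruction` / `output`), and the `### Answer:` response template are placeholders and would need to match the prompt format actually used in the chapter:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer, DataCollatorForCompletionOnlyLM

# Placeholders: swap in the model and dataset actually used in the chapter.
model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
dataset = load_dataset("lucasmccabe-lmi/CodeAlpaca-20k", split="train")

model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Render each example as an instruction/answer pair with an explicit answer marker.
def formatting_prompts_func(example):
    output_texts = []
    for i in range(len(example["instruction"])):
        text = (
            f"### Question: {example['instruction'][i]}\n"
            f" ### Answer: {example['output'][i]}"
        )
        output_texts.append(text)
    return output_texts

# Everything up to and including the response template gets label -100,
# so the loss (and the gradients) only covers the answer tokens.
response_template = " ### Answer:"
collator = DataCollatorForCompletionOnlyLM(response_template, tokenizer=tokenizer)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    args=SFTConfig(output_dir="./sft-completion-only", packing=False),
    formatting_func=formatting_prompts_func,
    data_collator=collator,
)
trainer.train()
```

As I understand it, with this collator the prompt tokens are masked out of the labels, so the model is no longer trained to reproduce the user question, only the answer. Is there a reason the chapter keeps the default DataCollatorForLanguageModeling instead?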