Use DataCollatorForCompletionOnlyLM to train the LLM to follow instructions
In Chapter 11 of the course, in the file "FineTuning with SFTTrainer" (3.mdx), you explain how to fine-tune a DeepSeek model with SFTTrainer on an instruction dataset.
Why don't you use the DataCollatorForCompletionOnlyLM data collator with the SFTTrainer, so that gradients are not computed and back-propagated on the user-question tokens? The default data collator in SFTTrainer is DataCollatorForLanguageModeling, which means that in your example the LLM will also learn to complete the user query, which, as far as I understand, is not the intent. I have sketched below what I had in mind.
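
For context, here is a minimal sketch adapted from the "train on completions only" example in the TRL documentation, assuming a TRL version where DataCollatorForCompletionOnlyLM is still available. The model name, dataset, column names (`instruction` / `output`), and the `### Answer:` response template are placeholders and would need to match the prompt format actually used in the chapter:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer, DataCollatorForCompletionOnlyLM

# Placeholders: swap in the model and dataset actually used in the chapter.
model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
dataset = load_dataset("lucasmccabe-lmi/CodeAlpaca-20k", split="train")

model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Render each example as an instruction/answer pair with an explicit answer marker.
def formatting_prompts_func(example):
    output_texts = []
    for i in range(len(example["instruction"])):
        text = (
            f"### Question: {example['instruction'][i]}\n"
            f" ### Answer: {example['output'][i]}"
        )
        output_texts.append(text)
    return output_texts

# Everything up to and including the response template gets label -100,
# so the loss (and the gradients) only covers the answer tokens.
response_template = " ### Answer:"
collator = DataCollatorForCompletionOnlyLM(response_template, tokenizer=tokenizer)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    args=SFTConfig(output_dir="./sft-completion-only", packing=False),
    formatting_func=formatting_prompts_func,
    data_collator=collator,
)
trainer.train()
```

As I understand it, with this collator the prompt tokens are masked out of the labels, so the model is no longer trained to reproduce the user question, only the answer. Is there a reason the chapter keeps the default DataCollatorForLanguageModeling instead?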