Why does the alignment-handbook include user and system inputs in the loss calculation?

Open xffxff opened this issue 2 years ago • 3 comments

I noticed that the alignment-handbook doesn't ignore the loss calculated on the user and system inputs. As far as I know, many SFT setups choose to ignore these. I'm curious about the reasoning behind this difference.
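
For readers unfamiliar with what "ignoring" these tokens means, here is a minimal sketch in plain Python. It assumes the usual convention that label positions set to -100 are skipped by the loss (the default `ignore_index` of PyTorch's `CrossEntropyLoss`). The token ids and the helper names `find_subsequence` and `completion_only_labels` are made up for illustration and are not taken from the alignment-handbook or TRL.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss by default


def find_subsequence(seq, pattern):
    """Return the start index of the first occurrence of pattern in seq, or -1."""
    n, m = len(seq), len(pattern)
    for i in range(n - m + 1):
        if seq[i:i + m] == pattern:
            return i
    return -1


def completion_only_labels(input_ids, response_template_ids):
    """Copy input_ids into labels, masking everything up to and including
    the response template (hypothetical helper, single-turn only)."""
    start = find_subsequence(input_ids, response_template_ids)
    if start == -1:
        # No assistant turn found: mask the whole example so it adds no loss.
        return [IGNORE_INDEX] * len(input_ids)
    cut = start + len(response_template_ids)
    return [IGNORE_INDEX] * cut + list(input_ids[cut:])


# Toy token ids (hypothetical): system=1,2  user=3,4,5  template=90,91  assistant=6,7,8
input_ids = [1, 2, 3, 4, 5, 90, 91, 6, 7, 8]

full_labels = list(input_ids)  # loss on every token, system and user included
masked_labels = completion_only_labels(input_ids, [90, 91])

print(full_labels)
print(masked_labels)  # [-100, -100, -100, -100, -100, -100, -100, 6, 7, 8]
```

With `full_labels` the model is also trained to predict the system and user tokens; with `masked_labels` only the assistant tokens contribute to the loss.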

xffxff avatar Nov 28 '23 06:11 xffxff

I'm curious about the official response here.

My guess would be:

  • Packing currently does not work with completion-only training in TRL's implementation, which makes training much slower on massive datasets
  • In my experience, completion-only training yielded worse performance when fine-tuning for new tasks, as evaluated on those new tasks specifically

FYI: if you want to fork, you can use completion-only training with minimal changes.
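
To illustrate the packing point, here is a rough sketch in plain Python of what packing does to token sequences. It is not TRL's implementation; the `pack` helper, the token ids, and the use of `90` as a stand-in response-template token are assumptions for illustration only. It shows that a packed block can contain pieces of several examples, so anything that masks prompts has to handle more than one prompt/response boundary per block.

```python
IGNORE_INDEX = -100  # label value skipped by the loss


def pack(examples, block_size):
    """Concatenate (input_ids, labels) pairs and cut them into fixed-size blocks.
    Leftover tokens that do not fill a whole block are dropped (hypothetical helper)."""
    flat_ids, flat_labels = [], []
    for ids, labels in examples:
        flat_ids.extend(ids)
        flat_labels.extend(labels)
    blocks = []
    for i in range(0, len(flat_ids) - block_size + 1, block_size):
        blocks.append((flat_ids[i:i + block_size], flat_labels[i:i + block_size]))
    return blocks


# Three toy examples; labels were already masked per example before packing.
examples = [
    ([1, 2, 90, 6, 7], [IGNORE_INDEX] * 3 + [6, 7]),
    ([3, 4, 5, 90, 8], [IGNORE_INDEX] * 4 + [8]),
    ([9, 90, 10, 11], [IGNORE_INDEX] * 2 + [10, 11]),
]

blocks = pack(examples, block_size=6)
for ids, labels in blocks:
    print(ids, labels)
```

In this toy run the second block holds the tail of one example, all of its response, and the start of another, so it contains two template tokens. Masking each example before concatenation, as done here, keeps the labels aligned; whether that fits a given training pipeline is something to verify rather than assume.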

nathan-az avatar Nov 29 '23 00:11 nathan-az

Do you have any references or evidence that completion-only tuning performs worse on new tasks? I want to learn more!

MAOJIASONG avatar May 30 '24 06:05 MAOJIASONG

Nope, no references other than trying it on an internal use case and seeing much worse eval results. I didn't look much further into it and just went back to packing without completion-only.

I still linked the docs because I'd encourage interested parties to try it out :)

nathan-az avatar May 30 '24 07:05 nathan-az