OpenChatKit icon indicating copy to clipboard operation
OpenChatKit copied to clipboard

Why instruction tuning calculate whole sentence loss?

Open sxthunder opened this issue 1 year ago • 0 comments

I noticed that OIG dataset adds human and bot tag in each sample. In your code, you directly pack samples to max seq length and calculate cross entropy on whole sentence. Will this make the model output human, bot tag and not knowing when to stop? Does only calculate the last bot response loss be more suitable?

sxthunder avatar Mar 14 '23 12:03 sxthunder