OpenChatKit
Why does instruction tuning compute the loss over the whole sequence?
I noticed that the OIG dataset adds human and bot tags to each sample. In your code, you pack samples directly up to the max sequence length and compute cross-entropy over the whole sequence. Won't this make the model emit the human/bot tags itself and not know when to stop? Would it be more suitable to compute the loss only on the last bot response?
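For reference, here is a minimal sketch of what "loss on bot responses only" could look like. This is not OpenChatKit's actual code: the `<human>:` / `<bot>:` tag strings, the `build_labels` helper, and the use of `-100` as the ignore index are assumptions for illustration, following the common Hugging Face / PyTorch convention of masking prompt tokens out of the cross-entropy.

```python
import torch
from transformers import AutoTokenizer

IGNORE_INDEX = -100  # CrossEntropyLoss ignores positions labeled -100 by default

def build_labels(turns, tokenizer):
    """Tokenize alternating (speaker, text) turns; mask everything except bot text."""
    input_ids, labels = [], []
    for speaker, text in turns:
        ids = tokenizer(f"{speaker}: {text}\n", add_special_tokens=False).input_ids
        input_ids.extend(ids)
        if speaker == "<bot>":
            labels.extend(ids)                         # learn to generate bot tokens
        else:
            labels.extend([IGNORE_INDEX] * len(ids))   # do not learn to generate prompts/tags
    return torch.tensor(input_ids), torch.tensor(labels)

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in tokenizer
turns = [("<human>", "What is the capital of France?"),
         ("<bot>", "The capital of France is Paris.")]
input_ids, labels = build_labels(turns, tokenizer)

# Standard shifted causal-LM cross-entropy; masked positions contribute nothing.
logits = torch.randn(1, input_ids.size(0), tokenizer.vocab_size)  # stand-in model output
loss = torch.nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, logits.size(-1)),
    labels[None, 1:].reshape(-1),
    ignore_index=IGNORE_INDEX,
)
print(loss)
```

With this kind of masking, the model is still conditioned on the human turns and the tags, but gradients only flow through the bot response tokens, which is the behavior the question is asking about.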