Calibration Dataset: how to avoid computing loss on instructions?

Open RanchiZhao opened this issue 1 year ago • 0 comments

I have a question about calibrating chat models. When I calibrate with AWQ, I want the loss to be computed only on the responses, not on the instructions. How should I prepare the calibration dataset to achieve this? For example, how should the labels and attention masks be set?
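To make the question concrete, here is a minimal sketch of the usual response-only masking convention from causal-LM fine-tuning, where instruction positions in `labels` are set to `-100` (the default `ignore_index` of `torch.nn.CrossEntropyLoss`) while the attention mask stays `1` on them so the model still attends to the instruction. The helper name `build_example`, the token ids, and `pad_id=0` are hypothetical and for illustration only. Whether a given AWQ implementation reads `labels` at all depends on the library and is not assumed here.

```python
# Sketch only: response-only label masking as commonly done for causal LMs.
# build_example, the token ids, and pad_id are illustrative assumptions,
# not part of any AWQ library's API.

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_example(prompt_ids, response_ids, max_len, pad_id=0):
    """Concatenate prompt and response, masking prompt and padding in labels."""
    # Full token sequence fed to the model, truncated to max_len.
    input_ids = (list(prompt_ids) + list(response_ids))[:max_len]

    # Instruction positions are ignored by the loss; response positions keep
    # their token ids as targets.
    labels = ([IGNORE_INDEX] * len(prompt_ids) + list(response_ids))[:max_len]

    # Real tokens (instruction and response) are attended to.
    attention_mask = [1] * len(input_ids)

    # Right-pad to max_len; padding is neither attended to nor scored.
    pad = max_len - len(input_ids)
    input_ids += [pad_id] * pad
    labels += [IGNORE_INDEX] * pad
    attention_mask += [0] * pad

    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "labels": labels,
    }


example = build_example(prompt_ids=[11, 12, 13], response_ids=[21, 22], max_len=8)
```

Note that the attention mask and the labels play different roles here: the mask only distinguishes real tokens from padding, while the `-100` entries in `labels` are what exclude the instruction from the loss.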

Jun 28 '24 03:06 RanchiZhao