
On using a direct copy of input_ids as the text label

Open Cooperx521 opened this issue 11 months ago • 1 comment

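Hello author. documents/pretraining/Causal LM for Continual Pre-training.md says that, when preparing the input, you only need to copy input_ids as the label. Since the labels have to be shifted left by one position when the loss is computed, where is that shift performed? Is it in the trainer? If so, how does the trainer know that this is a causal LM loss?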

Cooperx521, Mar 18 '24 15:03

This step is implemented in the model's forward method. See line 122 of https://github.com/HugAILab/HugNLP/blob/main/models/language_modeling/causal_lm.py:

# Shift so that tokens < n predict n
shift_logits = lm_logits[..., :-1, :].contiguous()
shift_labels = labels[..., 1:].contiguous()
# print("shift_labels=", shift_labels)
# Flatten the tokens
loss_fct = CrossEntropyLoss()
loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
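
To see why copying input_ids as labels is enough, here is a minimal pure-Python sketch of the same shift. It is not code from HugNLP: `causal_lm_loss`, `one_hot_logits`, and the toy token ids are hypothetical names and values chosen for illustration, and the cross-entropy is written out by hand instead of calling `CrossEntropyLoss`. Because the shift happens where the loss is computed, the caller (for example a trainer) would only need to pass the labels through and use the returned loss.

```python
import math

def causal_lm_loss(logits, labels, ignore_index=-100):
    """Mean next-token cross-entropy for a single sequence.

    logits: list of T rows, each a list of V scores (one row per position).
    labels: list of T token ids, here just a copy of input_ids.
    """
    # Shift so that the scores at position t are compared with token t + 1.
    shift_logits = logits[:-1]   # drop the last position (nothing follows it)
    shift_labels = labels[1:]    # drop the first token (nothing predicts it)

    total, count = 0.0, 0
    for scores, target in zip(shift_logits, shift_labels):
        if target == ignore_index:
            continue  # skip masked positions
        # Numerically stable log-sum-exp, then negative log-likelihood.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[target]
        count += 1
    # Guard against division by zero when every position is masked.
    return total / count if count else 0.0

VOCAB_SIZE = 4  # hypothetical toy vocabulary

def one_hot_logits(token, scale=10.0):
    """Hypothetical helper: scores that strongly favour `token`."""
    return [scale if v == token else 0.0 for v in range(VOCAB_SIZE)]

input_ids = [2, 0, 1, 3]
labels = list(input_ids)  # labels are a plain copy of input_ids

# A "model" that predicts the NEXT token at every position (last row is arbitrary).
next_token_logits = [one_hot_logits(t) for t in input_ids[1:]] + [one_hot_logits(0)]
# A "model" that just echoes the CURRENT token at every position.
echo_logits = [one_hot_logits(t) for t in input_ids]

loss_next = causal_lm_loss(next_token_logits, labels)
loss_echo = causal_lm_loss(echo_logits, labels)
print(loss_next, loss_echo)
```

The next-token predictor gets a loss close to zero while the echoing one is penalised heavily, which shows that the unshifted copy of input_ids is only turned into next-token targets inside the loss computation.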

wjn1996, Apr 12 '24 10:04