
Results: 3 issues by songt

This is good work! But I'd like to know: does it work for multi-label classification, for example with BCELoss in PyTorch? Thanks.

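Hello: 1. How is the pretraining data for RoBERTa pair constructed, and how does it differ from ordinary RoBERTa? 2. Will you later release base-size BERT and RoBERTa model weights pretrained on the CLUE vocab? Thanks!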

Optimize the preprocess function of SFT to make the label mask logic clearer, and add some comments.
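
For readers unfamiliar with the term, a label mask in SFT preprocessing usually means that only the response tokens contribute to the loss. The sketch below is not the code from this change. It is a minimal illustration that assumes already-tokenized turns, a hypothetical helper `build_labels`, and the common convention of marking ignored positions with -100 (the default `ignore_index` of PyTorch's `CrossEntropyLoss`).

```python
from typing import List, Tuple

# Assumption: -100 marks positions the loss should skip
# (it matches the default ignore_index of PyTorch's CrossEntropyLoss).
IGNORE_INDEX = -100


def build_labels(turns: List[Tuple[str, List[int]]]) -> Tuple[List[int], List[int]]:
    """Concatenate tokenized turns and mask every non-assistant token.

    `turns` is a list of (role, token_ids) pairs. The role names and this
    function are hypothetical, chosen only for illustration.
    """
    input_ids: List[int] = []
    labels: List[int] = []
    for role, token_ids in turns:
        input_ids.extend(token_ids)
        if role == "assistant":
            # Response tokens are kept as training targets.
            labels.extend(token_ids)
        else:
            # Prompt tokens are masked so they do not contribute to the loss.
            labels.extend([IGNORE_INDEX] * len(token_ids))
    return input_ids, labels


# Toy token ids standing in for a two-round conversation.
example_turns = [
    ("user", [11, 12, 13]),
    ("assistant", [21, 22]),
    ("user", [31]),
    ("assistant", [41, 42, 43]),
]
input_ids, labels = build_labels(example_turns)
```

Building `input_ids` and `labels` in the same loop keeps the two lists aligned by construction, which tends to be easier to follow than computing mask offsets after concatenation.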