Yunfan Shao

Results 35 comments of Yunfan Shao

Could you please share a small piece of the GloVe file that triggers the bug as an example here? We haven't run into this issue when using GloVe. Thanks.

I'd recommend trying some of the newer code frameworks that use the latest training techniques, such as flash attention.

CPT's generation ability is about the same as a BART with the same number of parameters, but CPT's NLU ability is much better.

Yes. For denoising we followed BART's setup and used only text infilling, without adding insert or rotate. The BART paper reports that this works best.
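For readers unfamiliar with text infilling, here is a rough sketch of the idea: spans of tokens are sampled, and each span is replaced by a single mask token. This is not the CPT or BART code. The function names, the `[MASK]` string, the 0.3 mask ratio, the way span starts are chosen, and the choice to force every span to cover at least one token are all assumptions made for illustration; only the Poisson span lengths with λ = 3 come from the BART paper.

```python
import math
import random
from typing import List, Optional


def sample_poisson(rng: random.Random, lam: float) -> int:
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def text_infilling(
    tokens: List[str],
    mask_token: str = "[MASK]",    # assumed mask symbol
    mask_ratio: float = 0.3,       # assumed fraction of tokens to cover
    poisson_lambda: float = 3.0,   # span-length parameter used in the BART paper
    seed: Optional[int] = None,
) -> List[str]:
    """Replace randomly chosen spans of tokens with a single mask token each."""
    rng = random.Random(seed)
    budget = int(round(len(tokens) * mask_ratio))  # max number of tokens to hide
    # Heuristic: start spans often enough that roughly mask_ratio of the
    # tokens end up covered when the mean span length is poisson_lambda.
    start_prob = mask_ratio / poisson_lambda
    out: List[str] = []
    i = 0
    while i < len(tokens):
        if budget > 0 and rng.random() < start_prob:
            # Simplification: force each span to cover at least one token.
            span = max(1, sample_poisson(rng, poisson_lambda))
            span = min(span, budget, len(tokens) - i)
            out.append(mask_token)  # the whole span collapses to one mask
            i += span
            budget -= span
        else:
            out.append(tokens[i])
            i += 1
    return out
```

Because a whole span collapses into one mask token, the corrupted sequence is usually shorter than the input, and the model has to predict how many tokens are missing as well as what they are.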

It's the first option: articles that are too long are split into multiple segments of length 1024, and short ones are padded to 1024.
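A minimal sketch of that splitting and padding, assuming the input is already a list of token ids. The function name `chunk_and_pad` and the padding id `0` are hypothetical and not taken from the CPT code.

```python
from typing import List


def chunk_and_pad(
    token_ids: List[int],
    max_len: int = 1024,
    pad_id: int = 0,  # assumed padding id; use the tokenizer's real one
) -> List[List[int]]:
    """Split token_ids into consecutive max_len-sized chunks and pad the last one."""
    chunks = [token_ids[i:i + max_len] for i in range(0, len(token_ids), max_len)]
    # Only the final chunk can be short; right-pad every chunk to max_len.
    return [chunk + [pad_id] * (max_len - len(chunk)) for chunk in chunks]
```

For example, a 2500-token article yields three chunks of length 1024, the last holding 452 real tokens followed by padding, while a 3-token input yields a single padded chunk.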