Hengyuan Zhang

Results 8 comments of Hengyuan Zhang

I don't think we need to check need%2 == 1; that case doesn't need to be handled.

Is this zuowen_epoch40 model the officially released CPM model, or did you train it from scratch yourself?

That one is for validation. I've already found it: it uses the generate interface. You can refer to this page: https://github.com/RussianNLP/russian_paraphrasers/blob/master/russian_paraphrasers/paraphrasers/paraphraser_mt5.py
On 09/24/2021 ***@***.*** wrote: Yes, there is. You can also refer to this: https://huggingface.co/transformers/training.html
trainer = Trainer(model=model, args=training_args, train_dataset=small_train_dataset, eval_dataset=small_eval_dataset, compute_metrics=compute_metrics)
trainer.evaluate()

Try it on CPU? It works for me. The top-1 operation is non-differentiable, but the balance loss is based on the logits of the gating distribution and the count of tokens per expert,...
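To illustrate the point about the balance loss, here is a minimal sketch. It assumes a Switch-Transformer-style auxiliary loss; the comment above is truncated, so the exact formulation it refers to is not known. The function name `load_balance_loss` and the plain-Python list representation are hypothetical and chosen only for illustration. There is no autograd here; in a real framework the gradient would flow only through the mean gate probabilities, while the per-expert token counts act as constants.

```python
import math


def softmax(logits):
    # Numerically stable softmax over one token's gate logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def load_balance_loss(gate_logits):
    """gate_logits: list of per-token lists, one logit per expert (assumed layout)."""
    num_tokens = len(gate_logits)
    num_experts = len(gate_logits[0])
    probs = [softmax(row) for row in gate_logits]

    # Hard top-1 routing: argmax is non-differentiable, so these counts
    # would carry no gradient in an autograd framework.
    counts = [0] * num_experts
    for p in probs:
        counts[p.index(max(p))] += 1
    token_fraction = [c / num_tokens for c in counts]

    # Mean gate probability per expert: a smooth function of the logits,
    # which is where the gradient of the loss would come from.
    mean_prob = [sum(p[e] for p in probs) / num_tokens for e in range(num_experts)]

    # Equals 1.0 for perfectly balanced routing; grows as routing collapses.
    return num_experts * sum(f * q for f, q in zip(token_fraction, mean_prob))


balanced = load_balance_loss([[10.0, 0.0], [0.0, 10.0]])
collapsed = load_balance_loss([[10.0, 0.0], [10.0, 0.0]])
```

Under these assumptions, `collapsed` comes out larger than `balanced`, so minimizing the loss pushes the gate toward spreading tokens across experts even though the top-1 choice itself has no gradient.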

Hello author, will the financial instruction dataset be open-sourced in the future?

Same question: when will it be released?