Liang Ding
+1, looking forward to the search code urgently~
@CyndxAI Hi, Rico has already implemented this feature in [subword-nmt](https://github.com/rsennrich/subword-nmt)
@zhyongquan Anomaly Detection _12 Jan. 2019_ **translation version 0.1 finished**
@zhyongquan Recurrent Neural Networks
I have the same doubt; maybe it should be changed to d_model ** -0.5. Anyway, I will run experiments with both methods.
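For context, the factor under discussion appears in the Transformer ("Noam") learning-rate schedule; a minimal sketch, assuming the schedule from "Attention Is All You Need", where the d_model ** -0.5 term sets the overall scale and the warmup steps control where the rate peaks:

```python
def noam_lr(step: int, d_model: int = 512, warmup: int = 4000) -> float:
    """Transformer learning-rate schedule:
    lrate = d_model**-0.5 * min(step**-0.5, step * warmup**-1.5)

    The rate grows linearly for the first `warmup` steps, then decays
    proportionally to step**-0.5; the two branches meet at step == warmup.
    """
    step = max(step, 1)  # avoid step**-0.5 blowing up at step 0
    return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)

# The peak rate occurs exactly at step == warmup, where both branches
# of the min() are equal: d_model**-0.5 * warmup**-0.5.
```

Experimenting with the alternative scaling mentioned above would amount to swapping the constant in front while keeping the min() shape intact.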
Looking forward to more information as well.
Hi Emasoft, thanks for your kind interest in our report. Our original intention was to maximize the translation performance of ChatGPT, and to that end we give several pieces...
You are welcome to try our proposed strategies in GPT-4, and any feedback is welcome: post it here or drop an email to me ([email protected]) or the first author ([email protected]).
Hello! Great work~ But I have a question. Training from scratch on the WMT14 En-De 4.5M training set gives roughly 36\~37 on newstest19. Although the original mBART paper does not include this setting, it does include WMT17 En-Lv at the same 4.5M scale, where mBART yields a significant improvement; so the result here should be above 37~. Why is w/ mBART only 30.5 and w/ mRASP only 35.2 in the paper? Hope you can clarify.