Kiren Wang

3 issues by Kiren Wang

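I sent a request following the blogger's API, but the obj_resp I got back at the end seems to have changed. Could the blogger please take a look?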

Thanks for your work! It is very valuable! I would like to know how you reached your conclusion about token routing. Since the input is affected by attention and RoPE, it...

Which paper proposed the in-batch debiased cross-entropy loss? Could you point to the relevant literature?