zaifeiyang

2 comments by zaifeiyang

Also, the documentation's description of send appears to be incorrect as well: in practice it can only return a dict, not a tensor.

> 3\. has > @haotian-liu Excuse me, I swapped in a different LLM and modified the training code. The initial loss during pretraining is about 5.5, and it is still...