liyulin

Results 2 issues of liyulin

When generating responses, why are the last 50 items appended again here? Doesn't that duplicate the final 50 data items? ![image](https://github.com/GanjinZero/RRHF/assets/85541451/67173870-ab99-4d74-9e64-a07188732620)

After one loss backward pass and optimizer step, the next forward pass produces inf hidden states at the embedding layer output, and the loss becomes NaN.
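To illustrate why a single inf in the hidden states is enough to make the loss NaN, here is a minimal pure-Python sketch. It is not taken from the reporter's code: `log_softmax`, `cross_entropy`, and `first_non_finite` are hypothetical helpers written for this example, and the plain lists stand in for real tensors.

```python
import math

def log_softmax(logits):
    """Log-softmax over a list of floats, using the usual max-shift trick."""
    m = max(logits)
    # If m is inf, the entry that equals m becomes inf - inf = nan.
    shifted = [x - m for x in logits]
    log_sum = math.log(sum(math.exp(s) for s in shifted))
    return [s - log_sum for s in shifted]

def cross_entropy(logits, target):
    """Negative log-probability of the target index."""
    return -log_softmax(logits)[target]

def first_non_finite(named_values):
    """Return the name of the first entry containing inf or nan, else None."""
    for name, values in named_values:
        if any(not math.isfinite(v) for v in values):
            return name
    return None

# One inf among the logits turns the whole loss into nan.
bad_loss = cross_entropy([1.0, float("inf"), 2.0], 1)
good_loss = cross_entropy([1.0, 2.0, 3.0], 2)
print(math.isnan(bad_loss), math.isfinite(good_loss))

# Scanning outputs in forward order points at the earliest affected layer.
# The layer names here are made up for the example.
activations = [
    ("embedding", [0.1, float("inf")]),
    ("layer_0", [float("nan"), float("nan")]),
]
print(first_non_finite(activations))
```

Running the same kind of scan over the model's parameters right after the optimizer step, and over each layer's output on the next forward pass, would show whether the embedding weights themselves became non-finite during the update or whether the overflow first appears in the forward computation.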