JingerAI

5 comments by JingerAI

@guijuzhejiang The author's suggestion is that both models belong to the same model family, as confirmed in the InstructGPT paper.

@xiangrongzeng Will the project support GLM? It would be a benefit for Chinese users.

Also, is it necessary to load the SFT weights in step 2? In this project, the reward model (RW) starts only from the pre-trained model, which differs from InstructGPT.

@yaozhewei You are right, many thanks. Additionally, in this project, it appears that the KL penalty is added starting from the last token. Does this approach make sense in light...

> @JingerAI please create another issue so we can separate different problems. @panganqi and others, please try the latest branch to see if the accuracy problem is resolved. Also, please...