Thewillman

4 comments by Thewillman

> I couldn't solve it and still have this problem. Have you solved it? I use supconloss on my dataset with batchsize=128 and the loss doesn't decrease.

The reward accuracies just hover around 0.5, which means that in some steps the chosen rewards can be smaller than the rejected rewards.
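
To make the 0.5 figure concrete, here is a minimal sketch. It assumes the accuracy is simply the fraction of preference pairs whose chosen reward is strictly larger than the rejected reward. The function name `reward_accuracy` and the sample numbers are made up for illustration and are not taken from any library or from my run.

```python
def reward_accuracy(chosen_rewards, rejected_rewards):
    """Fraction of pairs where the chosen reward is strictly larger.

    Assumption: this mirrors how a "rewards accuracy" metric is usually
    defined for preference training; the name is hypothetical.
    """
    if len(chosen_rewards) != len(rejected_rewards):
        raise ValueError("reward lists must have the same length")
    if not chosen_rewards:
        raise ValueError("need at least one pair")
    # Count the pairs in which the chosen response out-scores the rejected one.
    wins = sum(c > r for c, r in zip(chosen_rewards, rejected_rewards))
    return wins / len(chosen_rewards)


# Made-up rewards for four preference pairs: two wins, two losses.
chosen = [0.2, -0.1, 0.5, -0.3]
rejected = [0.1, 0.3, 0.4, 0.0]
accuracy = reward_accuracy(chosen, rejected)
```

With numbers like these the metric sits at chance level, so a value that stays near 0.5 suggests the model is not yet separating chosen from rejected responses.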

> Sorry, but despite my best efforts I can't understand your question. You're talking about similar prompts in a list, about modifying the codebase without providing us with your modifications,...

> > Sorry, but despite my best efforts I can't understand your question. You're talking about similar prompts in a list, about modifying the codebase without providing us with your...