ColossalAI
[BUG]: train_rm.py gets lower acc!
🐛 Describe the bug
Hello, here is a bug similar to issue #3534.
Using the default Anthropic/hh-rlhf dataset with pretrain_model: bigscience/bloom-1b1, batch_size: 1, max_epochs: 1, max_len: 512, loss_fn: log_sig.
The loss changes randomly, and the accuracy is much lower than the one reported in the README.
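For context, the `loss_fn: log_sig` named in the config refers to the standard pairwise log-sigmoid ranking loss commonly used for reward-model training: the model should score the chosen response higher than the rejected one. A minimal sketch of that loss (the function name and scalar interface are illustrative, not ColossalAI's actual API):

```python
import math

def pairwise_logsig_loss(chosen_reward: float, rejected_reward: float) -> float:
    """Pairwise log-sigmoid ranking loss for a reward model.

    loss = -log(sigmoid(r_chosen - r_rejected))

    The loss is small when the chosen response scores well above the
    rejected one, and grows as the ranking is violated.
    """
    diff = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-diff)))
```

With equal rewards the loss is log(2) ≈ 0.693 (a 50/50 guess), so during a healthy training run the average loss should drift below that value as accuracy rises above chance.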
I used the following command:
Here are the results:
Is there any advice?
Thanks!
Environment
No response
Has it been trained for one full epoch?
Yes, it was trained for one full epoch.
Hi @Yutongamber, the cause may be an inappropriate sh command. We have fixed it in #3490. Thanks. https://github.com/hpcaitech/ColossalAI/blob/main/applications/Chat/examples/train_rm.sh