DeepSpeedExamples
Use bloom-350m to train the reward model in step 2
I want to train bloom_350m on a Chinese dataset. I ran run_350m.sh after changing model_name_or_path, but the loss is NaN. How should I solve this? Could the argument "num_padding_at_beginning" be causing it?
Sorry, it's bloom-560m, not bloom_350m.
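For anyone wondering how a padding setting could lead to a NaN loss at all, here is a minimal plain-Python sketch of the general idea. It is not the DeepSpeed-Chat code: the function names, the toy token ids, and the reward values are all hypothetical, and whether this is the actual cause in this issue is an assumption. The sketch assumes the end of a sequence is located by skipping the first `num_padding_at_beginning` pad tokens and taking the next pad position. With left padding and a value of 0, that position is index 0, the scored span is empty, and the mean of an empty span is NaN.

```python
import math


def end_of_sequence_index(token_ids, pad_id, num_padding_at_beginning):
    """Hypothetical helper: position of the first pad token treated as trailing padding.

    The first `num_padding_at_beginning` pad tokens are skipped, on the
    assumption that the tokenizer itself placed them at the start.
    """
    pad_positions = [i for i, tok in enumerate(token_ids) if tok == pad_id]
    if len(pad_positions) > num_padding_at_beginning:
        return pad_positions[num_padding_at_beginning]
    # No trailing padding found: the sequence fills the whole row.
    return len(token_ids)


def mean_or_nan(values):
    """Mean of a list, or NaN for an empty list (mirrors what tensor means do)."""
    return sum(values) / len(values) if values else math.nan


PAD = 0  # toy pad id, chosen only for this example
right_padded = [5, 6, 7, PAD, PAD]
left_padded = [PAD, PAD, 5, 6, 7]
rewards = [0.1, 0.2, 0.3, 0.4, 0.5]  # made-up per-token rewards

right_end = end_of_sequence_index(right_padded, PAD, 0)
left_end = end_of_sequence_index(left_padded, PAD, 0)

print(right_end, mean_or_nan(rewards[:right_end]))
print(left_end, mean_or_nan(rewards[:left_end]))  # empty span, so the mean is NaN
```

If something like this is happening, it may be worth checking the tokenizer's padding side and the value passed for num_padding_at_beginning before changing any model code.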
@panxb833 Hi! I met the same problem as you. Do you know how to solve it?
@panxb833 @LuciusMos I met the same problem as you. Do you know how to solve it?
It's working for me. What error are you seeing?
same error
same error
I have fixed this problem. Ref: https://github.com/microsoft/DeepSpeedExamples/issues/571
@scarydemon2 I have the same problem. Do we need to modify the code in the reward_model forward_value function?