DeepSpeedExamples
Using bloom-350m to train the reward model in step 2
I want to train bloom-350m on a Chinese dataset. I ran run_350m.sh after changing model_name_or_path, but the loss is NaN. How should I solve this? Could the "num_padding_at_beginning" argument be causing it?
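One way to check the "num_padding_at_beginning" guess is to count how many pad tokens actually appear at the start of a tokenized sample and compare that with the value passed to the script. The snippet below is only a minimal sketch: `count_leading_padding` is a hypothetical helper, not part of DeepSpeedExamples, and the token IDs and pad IDs are made up for illustration rather than taken from a real tokenizer.

```python
def count_leading_padding(input_ids, pad_token_id):
    """Return how many consecutive pad tokens open the sequence."""
    count = 0
    for token_id in input_ids:
        if token_id != pad_token_id:
            break
        count += 1
    return count


# Hypothetical token IDs (assumption: NOT real tokenizer output).
with_leading_pad = [1, 1, 45, 67, 89]  # two pad tokens (id 1) at the start
right_padded = [45, 67, 89, 3, 3]      # pad tokens (id 3) only at the end

print(count_leading_padding(with_leading_pad, pad_token_id=1))  # 2
print(count_leading_padding(right_padded, pad_token_id=3))      # 0
```

In practice the `input_ids` and `pad_token_id` would come from the tokenizer loaded for the model in question. If the count differs from the value given for num_padding_at_beginning, that mismatch seems worth ruling out before looking at other possible causes of the NaN loss.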