DeepSpeedExamples use bloom-350m to train reward model in step2


Open 70557dzqc opened this issue 1 year ago • 1 comments

I want to train bloom_350m on a Chinese dataset. I ran run_350m.sh with model_name_or_path changed to point at that model, but the loss is NaN. How should I solve this? Could the "num_padding_at_beginning" argument be causing it?

Apr 19 '23 09:04 70557dzqc
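For readers unsure what "num_padding_at_beginning" changes, here is a minimal sketch of the idea. It assumes the reward model finds the end of each sequence by locating the first pad token after skipping a fixed number of leading pad tokens. The helper name `first_real_padding_index` and the token values are hypothetical and for illustration only; this is not the DeepSpeed-Chat code, and it does not claim to reproduce the NaN reported here.

```python
def first_real_padding_index(token_ids, pad_id, num_padding_at_beginning):
    """Return the index where trailing padding is assumed to start.

    Hypothetical helper: skips the first `num_padding_at_beginning` pad
    tokens, then returns the position of the next pad token. If there are
    not enough pad tokens, the full sequence length is returned.
    """
    pad_positions = [i for i, t in enumerate(token_ids) if t == pad_id]
    if len(pad_positions) > num_padding_at_beginning:
        return pad_positions[num_padding_at_beginning]
    return len(token_ids)


PAD = 0  # assumed pad token id, for illustration only

right_padded = [5, 6, 7, PAD, PAD]  # no leading pad token
left_padded = [PAD, PAD, 5, 6, 7]   # padding placed before the text

print(first_real_padding_index(right_padded, PAD, 0))  # 3: end of the real tokens
print(first_real_padding_index(right_padded, PAD, 1))  # 4: one pad token counted as text
print(first_real_padding_index(left_padded, PAD, 0))   # 0: sequence looks empty
```

If the setting does not match how the tokenizer actually pads, the computed end index can be off or even zero, so comparing the tokenizer's padding side and any leading special tokens against this argument is a reasonable first check.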

Sorry, the model is bloom-560m, not bloom_350m.

Apr 19 '23 10:04 70557dzqc

@panxb833 Hi! I ran into the same problem as you. Do you know how to solve it?

Apr 25 '23 07:04 LuciusMos

@panxb833 @LuciusMos I ran into the same problem as you. Do you know how to solve it?

May 24 '23 17:05 Ablustrund

It's working for me. What error are you getting?

May 27 '23 04:05 xlinsz

same error

May 30 '23 06:05 scarydemon2

same error

I have fixed this problem. Ref: https://github.com/microsoft/DeepSpeedExamples/issues/571

Jun 02 '23 09:06 scarydemon2

@scarydemon2 I have the same problem. Do we need to modify the forward_value function in reward_model?

Aug 10 '23 11:08 robotsp
