
Using LLaMA in reward model training

Open YingHH1 opened this issue 1 year ago • 6 comments

Hi,

I have encountered the error `TypeError: LlamaModel.forward() got an unexpected keyword argument 'head_mask'` when training the LLaMA-7B model in Step 2 (reward model training).

I was wondering if the head_mask is used at all in training the reward model?

Also, is there a quick fix for this error?

Many thanks

YingHH1 avatar Apr 19 '23 05:04 YingHH1

The LLaMA model does not have a "head_mask" argument, so you can remove it in utils/model/reward_model.py.
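
A minimal sketch of that removal, assuming the call site in RewardModel.forward() resembles the snippet below (the attribute and surrounding argument names are illustrative and may differ in your checkout):

```python
# In utils/model/reward_model.py, drop the head_mask keyword when calling
# the wrapped base model, since LlamaModel.forward() does not accept it:
transformer_outputs = self.rwtranrsformer(
    input_ids,
    attention_mask=attention_mask,
    # head_mask=head_mask,  # removed: not a LlamaModel.forward() kwarg
    inputs_embeds=inputs_embeds,
    use_cache=use_cache)
```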

jose77 avatar Apr 19 '23 07:04 jose77

> The LLaMA model does not have a "head_mask" argument, so you can remove it in utils/model/reward_model.py.

Thanks! So the reward model training doesn't use the head_mask at all then?

YingHH1 avatar Apr 19 '23 07:04 YingHH1

> The LLaMA model does not have a "head_mask" argument, so you can remove it in utils/model/reward_model.py.

When I remove "head_mask", I get the following error:

```
Traceback (most recent call last):
  File "main.py", line 357, in <module>
    main()
  File "main.py", line 311, in main
    reward_score, acc = evaluation_reward(rm_model, eval_dataloader)
  File "main.py", line 257, in evaluation_reward
    outputs = model(**batch)
  File "/home/kemove/anaconda3/envs/deepspeed/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/kemove/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, **kwargs)
  File "/home/kemove/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1695, in forward
    loss = self.module(*inputs, **kwargs)
  File "/home/kemove/anaconda3/envs/deepspeed/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1208, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/home/kemove/fkb/project/deepspeed/DeepSpeedExamples/applications/DeepSpeed-Chat/training/utils/model/reward_model.py", line 202, in forward
    rewards = self.v_head(hidden_states).squeeze(-1)
  File "/home/kemove/anaconda3/envs/deepspeed/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1208, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/home/kemove/anaconda3/envs/deepspeed/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (4096x32000 and 4096x1)
```

qinqinqaq avatar Apr 19 '23 10:04 qinqinqaq

When I remove the head_mask, I am met with the same divergence_id assertion error as in https://github.com/microsoft/DeepSpeedExamples/issues/338

YingHH1 avatar Apr 19 '23 10:04 YingHH1

I modified the following code and it worked:

```python
# self.config.n_embd = self.config.hidden_size if hasattr(
#     self.config, "hidden_size") else self.config.n_embd
# todo
self.config.n_embd = self.config.vocab_size if hasattr(
    self.config, "vocab_size") else self.config.n_embd
```
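
For context, a hedged reading of why this helps, based on the traceback above: the tensor reaching v_head has width 32000 (LLaMA's vocab_size) rather than 4096 (hidden_size), so the base model appears to be emitting vocabulary logits, and sizing the linear head from vocab_size makes the matmul line up. A self-contained shape check (the widths are LLaMA-7B's; batch and sequence sizes here are arbitrary):

```python
import torch
import torch.nn as nn

# LLaMA-7B dimensions: hidden_size = 4096, vocab_size = 32000.
# The traceback shows mat1 of shape (4096, 32000), i.e. v_head receives
# activations of width vocab_size, so the head must be Linear(32000, 1).
batch, seq_len, vocab_size = 2, 512, 32000

hidden_states = torch.randn(batch, seq_len, vocab_size)  # what v_head receives
v_head = nn.Linear(vocab_size, 1, bias=False)            # sized via the fix above

rewards = v_head(hidden_states).squeeze(-1)  # mirrors reward_model.py line 202
print(rewards.shape)  # torch.Size([2, 512])
```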

qinqinqaq avatar Apr 20 '23 02:04 qinqinqaq

> I modified the following code and it worked:
>
> ```python
> # self.config.n_embd = self.config.hidden_size if hasattr(
> #     self.config, "hidden_size") else self.config.n_embd
> # todo
> self.config.n_embd = self.config.vocab_size if hasattr(
>     self.config, "vocab_size") else self.config.n_embd
> ```

Thanks for your solution! After modifying the code according to your suggestion, I finally got it to work on Qwen as well.

zhouliang-yu avatar May 16 '24 07:05 zhouliang-yu