Using LLaMA in reward model training
Hi,
I have encountered a `TypeError: LlamaModel.forward() got an unexpected keyword argument 'head_mask'` error when training the LLaMA-7B model in Step 2 reward model training.
Is the `head_mask` argument used at all in reward model training?
Also, is there a quick fix for this error?
Many thanks
The LLaMA model does not have a `head_mask` argument; you can remove it in utils/model/reward_model.py.
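Concretely, that means removing the `head_mask` keyword from the transformer call in `RewardModel.forward`. A minimal before/after sketch of the call site, assuming the GPT-style invocation currently in utils/model/reward_model.py (attribute and argument names may differ slightly across versions):

```python
# Before: GPT-style call. LlamaModel.forward() has no head_mask
# parameter, which is what raises the TypeError.
# transformer_outputs = self.rwtranrsformer(
#     input_ids,
#     past_key_values=past_key_values,
#     attention_mask=attention_mask,
#     head_mask=head_mask,
#     inputs_embeds=inputs_embeds,
#     use_cache=use_cache)

# After: the same call with head_mask dropped.
transformer_outputs = self.rwtranrsformer(
    input_ids,
    past_key_values=past_key_values,
    attention_mask=attention_mask,
    inputs_embeds=inputs_embeds,
    use_cache=use_cache)
```

As far as I can tell, the Step 2 training loop never passes a non-default `head_mask`, so dropping it should not change training behavior.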
Thanks! So the reward model training doesn't use the head_mask at all then?
When I remove `head_mask`, I get this error:

```
Traceback (most recent call last):
  File "main.py", line 357, in
```
When I remove `head_mask`, I am met with the same `divergence_id` assertion error as in https://github.com/microsoft/DeepSpeedExamples/issues/338.
I modified the following code and it worked:

```python
# self.config.n_embd = self.config.hidden_size if hasattr(
#     self.config, "hidden_size") else self.config.n_embd
# todo
self.config.n_embd = self.config.vocab_size if hasattr(
    self.config, "vocab_size") else self.config.n_embd
```
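For anyone else hitting this: that mapping lives in `RewardModel.__init__`. Here is a paraphrased sketch of the surrounding context (not the verbatim DeepSpeedExamples source; names and lines may differ across versions), showing that `n_embd` is what sizes the scalar value head:

```python
import torch.nn as nn


class RewardModel(nn.Module):
    # Paraphrased sketch of utils/model/reward_model.py.
    def __init__(self, base_model, tokenizer, num_padding_at_beginning=0):
        super().__init__()
        self.config = base_model.config
        self.num_padding_at_beginning = num_padding_at_beginning
        # Original mapping: LLaMA-style configs expose hidden_size,
        # GPT-2-style configs expose n_embd.
        # self.config.n_embd = self.config.hidden_size if hasattr(
        #     self.config, "hidden_size") else self.config.n_embd
        # Workaround reported above: map vocab_size instead.
        self.config.n_embd = self.config.vocab_size if hasattr(
            self.config, "vocab_size") else self.config.n_embd
        # n_embd determines the input width of the scalar value head.
        self.v_head = nn.Linear(self.config.n_embd, 1, bias=False)
        self.rwtranrsformer = base_model  # attribute name as spelled upstream
        self.PAD_ID = tokenizer.pad_token_id
```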
Thanks for your solution! After modifying the code as you suggested, I finally got it working on Qwen as well.