DeepSpeedExamples

Llama2 as actor using zero_stage3

George-Chia opened this issue on Nov 18, 2023 · 1 comment

Hello! Did anyone meet the following bug when using zero_stage3 for Llama2?

```text
step3_rlhf_finetuning/rlhf_engine.py:61 in __init__

   58 │   self.num_total_iters = num_total_iters
   59 │   self.tokenizer = tokenizer
   60 │
❱  61 │   self.actor = self._init_actor(actor_model_name_or_path=actor_model_name_or_path)

AttributeError: 'LlamaAttention' object has no attribute 'rope_theta'
```

Note that OPT works as the actor, and Llama2 with zero_stage2 also works.

George-Chia · Nov 18 '23
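
A quick way to confirm whether the installed transformers release already ships the `rope_theta` field (the config attribute that the Llama attention layer copies, and the one the ZeRO stage-3 actor init trips over) is a short check like the one below. This is an editor-added diagnostic sketch, not part of the original report, and it assumes the root cause is an outdated transformers install:

```python
# Diagnostic sketch: check whether the installed transformers release
# exposes `rope_theta` on the Llama config. Older releases that predate
# this field will print False here and are expected to hit the
# AttributeError above during the stage-3 actor initialization.
import transformers
from transformers.models.llama.configuration_llama import LlamaConfig

print("transformers version:", transformers.__version__)

cfg = LlamaConfig()  # default config only; no weights are downloaded
print("rope_theta present on LlamaConfig:", hasattr(cfg, "rope_theta"))
```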

Same error with transformers==4.32.0. After updating to transformers==4.34.0, the error is gone. FYI.

Jeayea · Dec 13 '23
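
Since the thread points at the transformers version as the culprit, a small guard at the top of the training script can surface the problem before the ZeRO stage-3 engine is built. This is a hypothetical, editor-added sketch; the 4.34.0 floor is taken from the comment above, not from an official compatibility matrix:

```python
# Sketch of an early version guard: fail with a clear message instead of
# crashing inside the ZeRO stage-3 actor init when transformers is too old.
import transformers
from packaging import version  # packaging ships as a transformers dependency

MIN_TRANSFORMERS = "4.34.0"  # version reported to work in the comment above

if version.parse(transformers.__version__) < version.parse(MIN_TRANSFORMERS):
    raise RuntimeError(
        f"transformers {transformers.__version__} detected; please upgrade "
        f"(e.g. pip install 'transformers>={MIN_TRANSFORMERS}') before running "
        "step3_rlhf_finetuning with ZeRO stage 3 and a Llama2 actor."
    )
```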