Multi-GPU training errors with peft
🐛 Describe the bug
When I try to use multi-GPU training with accelerate, I get an error.
Code:
```python
import trlx
from peft import LoraConfig, TaskType
from trlx.data.configs import (
    ModelConfig,
    OptimizerConfig,
    SchedulerConfig,
    TokenizerConfig,
    TrainConfig,
    TRLConfig,
)
from trlx.models.modeling_ppo import PPOConfig

config = TRLConfig(
    train=TrainConfig(
        seq_length=1024,
        epochs=50,
        total_steps=100000,
        batch_size=1,
        checkpoint_interval=1000,
        eval_interval=200,
        pipeline="PromptPipeline",
        trainer="AcceleratePPOTrainer",
    ),
    model=ModelConfig(
        model_path='gpt2',
        num_layers_unfrozen=1,
        # peft_config={"peft_type": "LORA", "r": 1, "lora_alpha": 32, "lora_dropout": 0.1},
    ),
    tokenizer=TokenizerConfig(tokenizer_path='gpt2', truncation_side="right"),
    optimizer=OptimizerConfig(name="adamw"),
    scheduler=SchedulerConfig(name="cosine_annealing", kwargs={"T_max": 100000, "eta_min": 5.0e-6}),
    method=PPOConfig(
        name="PPOConfig",
        num_rollouts=128,
        chunk_size=16,
        ppo_epochs=4,
        init_kl_coef=0.1,
        target=6,
        horizon=10000,
        gamma=1,
        lam=0.95,
        cliprange=0.2,
        cliprange_value=0.2,
        vf_coef=0.2,
        scale_reward=None,
        ref_mean=None,
        ref_std=None,
        cliprange_reward=10,
        gen_kwargs={
            "max_new_tokens": 50,
        },
    ),
)

if __name__ == "__main__":

    def reward_fn(samples, **kwargs):
        return [0] * len(samples)

    trainer = trlx.train(
        reward_fn=reward_fn,
        prompts=['dummy dataset'],
        config=config,
    )
```
Launch command:
```bash
CUDA_VISIBLE_DEVICES=0,1 debug=true accelerate launch --mixed_precision bf16 trlx_minimal.py
```
Error:
```
  File "/home/olivia/experiments/cot_reliability/trlx_minimal.py", line 73, in <module>
    trainer = trlx.train(
  File "/home/olivia/miniconda3/envs/exps/lib/python3.9/site-packages/trlx/trlx.py", line 92, in train
    trainer = get_trainer(config.train.trainer)(
  File "/home/olivia/miniconda3/envs/exps/lib/python3.9/site-packages/trlx/trainer/accelerate_ppo_trainer.py", line 74, in __init__
    if not hasattr(self.model, "frozen_head") and not self.model.peft_type:
  File "/home/olivia/miniconda3/envs/exps/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1695, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'DistributedDataParallel' object has no attribute 'peft_type'
```
The error comes from these lines in accelerate_ppo_trainer.py:
```python
self.model, self.opt, self.scheduler, rollout_loader = self.accelerator.prepare(
    self.model, self.opt, self.scheduler, rollout_loader
)

self.store.clear_history()  # Clear the rollout store

if not hasattr(self.model, "frozen_head") and not self.model.peft_type:
    self.ref_model = self.get_arch(self.config)
```
`self.model` originally has a `peft_type` attribute set to `None`, but in multi-GPU mode the `self.accelerator.prepare` call wraps the model in a `DistributedDataParallel`, which does not expose this attribute. We can work around this by saving the `peft_type` attribute before `accelerator.prepare` and restoring it afterwards; with that change the code runs correctly.
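A minimal sketch of that workaround, paraphrasing the trainer code above (the only assumption is that `peft_type` is a plain attribute on the unwrapped trlx model, as the traceback suggests):

```python
# Paraphrased sketch of the workaround in AcceleratePPOTrainer.__init__:
# DistributedDataParallel does not forward arbitrary attribute lookups to
# the wrapped module, so stash peft_type from the unwrapped model and
# reattach it after prepare() wraps the model.
peft_type = getattr(self.model, "peft_type", None)

self.model, self.opt, self.scheduler, rollout_loader = self.accelerator.prepare(
    self.model, self.opt, self.scheduler, rollout_loader
)
self.model.peft_type = peft_type  # restore the attribute lost to the DDP wrapper

self.store.clear_history()  # Clear the rollout store

if not hasattr(self.model, "frozen_head") and not self.model.peft_type:
    self.ref_model = self.get_arch(self.config)
```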
However, even with this change, multi-GPU training does not work when using peft to implement LoRA. If I uncomment the `peft_config` line in the example script above and set `num_layers_unfrozen` to 1, single-GPU training seems to work correctly. But as soon as I add a second GPU, the script fails with an error saying that `DistributedDataParallel` has no attribute `forward_hydra`.
This problem can be fixed by removing all references to `peft_type` in accelerate_ppo_trainer.py (which also makes the workaround above unnecessary). With that change, training with LoRA seems to run correctly on both GPUs. However, I am not familiar enough with this codebase to know whether this fix introduces other errors that are not obvious.
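An alternative that keeps the checks instead of deleting them would be to read the attributes off the underlying model via accelerate's `unwrap_model` helper. This is only a sketch of the idea, not a tested patch, and any other `peft_type`/`forward_hydra` accesses in the file would need the same treatment:

```python
# Sketch: query the underlying model through accelerate's unwrap_model()
# instead of the DDP wrapper, so attribute checks keep working after prepare().
unwrapped = self.accelerator.unwrap_model(self.model)

if not hasattr(unwrapped, "frozen_head") and not unwrapped.peft_type:
    self.ref_model = self.get_arch(self.config)
```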
Which trlX version are you using?
trlx==0.7.0
Additional system and package information
Python 3.9, transformers 4.35.0, accelerate 0.24.1, Ubuntu
Hi, I ran into the same issue with `peft_type`. Did you solve this in the end?