XLoRA: training issues, Gradients will be None
I installed PEFT from source and am using the latest versions of Transformers and TRL. I pass the XLoRA model to TRL's `SFTTrainer`, but training doesn't seem to work: the training loss doesn't decrease and the validation loss remains constant. I also get this warning:

```
UserWarning: None of the inputs have requires_grad=True. Gradients will be None
```
I load Llama 3.1 (without quantization). For reference, the load step looks roughly like this (the exact model ID is an assumption on my part; any Llama 3.1 checkpoint applies):
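```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model ID; substitute your own Llama 3.1 checkpoint.
model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```

I then run this code: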
```python
from peft import PeftType, TaskType, XLoraConfig, get_peft_model
from trl import SFTConfig, SFTTrainer

# Note: the adapter keys must be "0", "1", ... (see the naming issue described below).
adapters = {
    "0": "./adapter1/",
    "1": "./adapter2/",
}

peft_config = XLoraConfig(
    task_type=TaskType.CAUSAL_LM,
    peft_type=PeftType.XLORA,
    hidden_size=model.config.hidden_size,
    xlora_depth=8,
    adapters=adapters,
    xlora_size=2048,
    layerwise_scalings=True,
    xlora_dropout_p=0.2,
)
xlora_model = get_peft_model(model, peft_config)

training_arguments = SFTConfig(
    output_dir="./output/",
    optim="paged_adamw_8bit",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,
    save_strategy="epoch",
    log_level="debug",
    logging_steps=1,
    learning_rate=1e-5,
    bf16=True,
    num_train_epochs=1,
    warmup_ratio=0.1,
    lr_scheduler_type="linear",
    dataset_text_field="text",
    max_seq_length=512,
)

trainer = SFTTrainer(
    model=xlora_model,
    train_dataset=ds,
    tokenizer=tokenizer,
    args=training_arguments,
)
trainer.train()
```
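To narrow down the warning, here is a quick diagnostic sketch (not part of the training script) for checking whether the X-LoRA classifier parameters actually require gradients after wrapping; `print_trainable_parameters` is the standard PEFT helper:

```python
# Diagnostic only: list what is trainable after get_peft_model().
xlora_model.print_trainable_parameters()

trainable = [name for name, param in xlora_model.named_parameters() if param.requires_grad]
print(f"{len(trainable)} trainable tensors")
print(trainable[:10])

# Assumption: if the warning comes from gradient checkpointing combined with
# frozen embeddings, the standard Transformers workaround is to make the
# inputs carry gradients before training:
model.enable_input_require_grads()
```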
I also observed another bug: the adapters must be named "0", "1", etc. in the adapters dict, otherwise training won't start and PEFT reports that the adapters don't exist.
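A minimal sketch of what I mean (assuming the same model and setup as above; with descriptive keys, the run aborts before training starts):

```python
# Same setup as above, but with descriptive adapter names instead of "0"/"1".
adapters = {
    "math": "./adapter1/",
    "code": "./adapter2/",
}
peft_config = XLoraConfig(
    task_type=TaskType.CAUSAL_LM,
    hidden_size=model.config.hidden_size,
    adapters=adapters,
)
# In my environment this errors out, claiming the adapters do not exist;
# renaming the keys to "0" and "1" makes it work.
xlora_model = get_peft_model(model, peft_config)
```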
Maybe @EricLBuehler can help with this?