XLoRA: training issues, Gradients will be None
I installed PEFT from source and am using the latest versions of Transformers and TRL. I pass the XLoRA model to TRL's `SFTTrainer`, but training doesn't seem to work: the training loss doesn't decrease and the validation loss remains constant. I also get this warning:

```
UserWarning: None of the inputs have requires_grad=True. Gradients will be None
```
I load Llama 3.1 (without quantization). For reference, the load step looks roughly like this (the exact model ID is an assumption on my part; any Llama 3.1 checkpoint applies):
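```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model ID; substitute your own Llama 3.1 checkpoint.
model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```

I then run this code: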
```python
from peft import PeftType, TaskType, XLoraConfig, get_peft_model
from trl import SFTConfig, SFTTrainer

# Note: the adapter keys must be "0", "1", ... (see the naming issue described below).
adapters = {
    "0": "./adapter1/",
    "1": "./adapter2/",
}

peft_config = XLoraConfig(
    task_type=TaskType.CAUSAL_LM,
    peft_type=PeftType.XLORA,
    hidden_size=model.config.hidden_size,
    xlora_depth=8,
    adapters=adapters,
    xlora_size=2048,
    layerwise_scalings=True,
    xlora_dropout_p=0.2,
)
xlora_model = get_peft_model(model, peft_config)

training_arguments = SFTConfig(
    output_dir="./output/",
    optim="paged_adamw_8bit",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,
    save_strategy="epoch",
    log_level="debug",
    logging_steps=1,
    learning_rate=1e-5,
    bf16=True,
    num_train_epochs=1,
    warmup_ratio=0.1,
    lr_scheduler_type="linear",
    dataset_text_field="text",
    max_seq_length=512,
)

trainer = SFTTrainer(
    model=xlora_model,
    train_dataset=ds,
    tokenizer=tokenizer,
    args=training_arguments,
)
trainer.train()
```
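To narrow down the warning, here is a quick diagnostic sketch (not part of the training script) for checking whether the X-LoRA classifier parameters actually require gradients after wrapping; `print_trainable_parameters` is the standard PEFT helper:

```python
# Diagnostic only: list what is trainable after get_peft_model().
xlora_model.print_trainable_parameters()

trainable = [name for name, param in xlora_model.named_parameters() if param.requires_grad]
print(f"{len(trainable)} trainable tensors")
print(trainable[:10])

# Assumption: if the warning comes from gradient checkpointing combined with
# frozen embeddings, the standard Transformers workaround is to make the
# inputs carry gradients before training:
model.enable_input_require_grads()
```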
I also observed another bug: the adapters must be named "0", "1", etc. in the adapters dict, otherwise training won't start and PEFT reports that the adapters don't exist.
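A minimal sketch of what I mean (assuming the same model and setup as above; with descriptive keys, the run aborts before training starts):

```python
# Same setup as above, but with descriptive adapter names instead of "0"/"1".
adapters = {
    "math": "./adapter1/",
    "code": "./adapter2/",
}
peft_config = XLoraConfig(
    task_type=TaskType.CAUSAL_LM,
    hidden_size=model.config.hidden_size,
    adapters=adapters,
)
# In my environment this errors out, claiming the adapters do not exist;
# renaming the keys to "0" and "1" makes it work.
xlora_model = get_peft_model(model, peft_config)
```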
Maybe @EricLBuehler can help with this?