unsloth
Use With AutoModelForCausalLMWithValueHead
Thanks for your great work developing Unsloth; it's a critical tool that cuts the cost of fine-tuning to less than half!
My question: reinforcement learning (PPO) in trl requires the model to have a value head (AutoModelForCausalLMWithValueHead) for value function estimation.
Are you open to official support for this type of model?
https://github.com/huggingface/trl/blob/main/trl/models/modeling_value_head.py#L61
Thanks for the kind words! Hm, in theory you can replace the causal LM head with other heads and it should still work. I suggest doing this after all of Unsloth's optimizations have been applied.
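For readers unfamiliar with the term, a value head is just a small module that maps each token's hidden state to one scalar. The sketch below is a hypothetical stand-in (the class name TinyValueHead and the sizes are made up for illustration, and it is not trl's actual implementation); it assumes PyTorch is installed and runs on CPU.

```python
import torch
from torch import nn


class TinyValueHead(nn.Module):
    """Hypothetical value head: one scalar value estimate per token position."""

    def __init__(self, hidden_size: int, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        self.summary = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Cast to the head's own dtype so a bf16 backbone can feed a float32 head
        hidden_states = hidden_states.to(self.summary.weight.dtype)
        # (batch, seq_len, hidden) -> (batch, seq_len)
        return self.summary(self.dropout(hidden_states)).squeeze(-1)


head = TinyValueHead(hidden_size=16)
# Pretend these are bf16 hidden states from the last layer of a language model
hidden = torch.randn(2, 5, 16).to(torch.bfloat16)
values = head(hidden)
print(values.shape, values.dtype)
```

The explicit cast in forward is the relevant design point here: a freshly created head defaults to float32, so mixing it with a bfloat16 backbone without a cast is one way to run into dtype errors.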
With the code below, I get the traceback that follows it:
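from unsloth import FastLanguageModel
from trl import AutoModelForCausalLMWithValueHead

def get_unsloth_model(base_model_name):
    # Load the 4-bit base model with Unsloth (the returned tokenizer is discarded here)
    model, _ = FastLanguageModel.from_pretrained(
        model_name=base_model_name,
        max_seq_length=2048,
        load_in_4bit=True,
    )
    # Attach LoRA adapters to the attention and MLP projections
    return FastLanguageModel.get_peft_model(
        model,
        target_modules=[
            "q_proj", "v_proj", "k_proj", "o_proj",  # attention (self_attn)
            "gate_proj", "down_proj", "up_proj",  # FFN (mlp)
        ],
        r=16,
        lora_alpha=64,
        lora_dropout=0,
        bias="none",
        use_gradient_checkpointing=True,
    )

# The original post called get_unsloth_model() with no argument;
# base_model_name stands for whichever checkpoint is being loaded
model = AutoModelForCausalLMWithValueHead.from_pretrained(get_unsloth_model(base_model_name))
...
ppo_trainer.step(...)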
  File "/root/ppot.py", line 205, in train
    stats = ppo_trainer.step(
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/usr/local/lib/python3.10/dist-packages/trl/trainer/ppo_trainer.py", line 795, in step
    train_stats = self.train_minibatch(
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/usr/local/lib/python3.10/dist-packages/trl/trainer/ppo_trainer.py", line 1068, in train_minibatch
    self.accelerator.backward(loss)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py", line 1966, in backward
    loss.backward(**kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_tensor.py", line 522, in backward
    torch.autograd.backward(
  File "/usr/local/lib/python3.10/dist-packages/torch/autograd/__init__.py", line 266, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/usr/local/lib/python3.10/dist-packages/torch/autograd/function.py", line 289, in apply
    return user_fn(self, *args)
  File "/usr/local/lib/python3.10/dist-packages/torch/cuda/amp/autocast_mode.py", line 142, in decorate_bwd
    return bwd(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/unsloth/kernels/fast_lora.py", line 131, in backward
    d_downA = h.t() @ (dY @ downB.t())
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != float
Update: to resolve the dtype mismatch, I needed to call trl.trainer.peft_module_casting_to_bf16(model).
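The RuntimeError above says that one operand of a matmul in the LoRA backward pass was bfloat16 and the other float32. When chasing this kind of error, it can help to list which parameters sit in which dtype before and after any casting step. The helper below is a minimal sketch for that purpose; the names params_by_dtype and report are hypothetical, and the code only assumes the model exposes named_parameters() the way a torch.nn.Module does.

```python
from collections import defaultdict


def params_by_dtype(model):
    """Group parameter names by dtype for any object exposing named_parameters()."""
    groups = defaultdict(list)
    for name, param in model.named_parameters():
        groups[str(param.dtype)].append(name)
    return dict(groups)


def report(model, show=5):
    """Print one line per dtype with a parameter count and a few example names."""
    for dtype, names in sorted(params_by_dtype(model).items()):
        print(f"{dtype}: {len(names)} parameters, e.g. {names[:show]}")
```

Calling report(model) once before and once after the cast should make it visible whether any LoRA weights (names containing lora_A or lora_B) are still float32 while the rest of the model is bfloat16.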