Ashwinee Panda
```
from unsloth import FastMistralModel
import torch

model, tokenizer = FastMistralModel.from_pretrained(
    args.model_path,
    max_seq_length=512,
    dtype=torch.bfloat16,
    load_in_4bit=False,
    attn_implementation="flash_attention_2",
    device_map='auto',
    use_cache=False,
)
model = FastMistralModel.get_peft_model(
    model,
    r = 8,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", ...
```
Thanks, I was using FastLanguageModel initially but was just using FastMistralModel for debugging. And the only part of SFTTrainer that actually seems to be happening (because the error is on...
Gotcha. If we look at

```
temp = (dY @ downB.t())
```

then the error indicates that downB is a float32 (which is correct) but dY is a bfloat16. Should...
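For what it's worth, the mismatch is easy to reproduce in isolation. A minimal sketch (variable names borrowed from the error, shapes invented purely for illustration):

```python
import torch

# Invented shapes; the real tensors come from the LoRA down-projection backward.
dY = torch.randn(4, 16, dtype=torch.bfloat16)     # upstream gradient in bf16
downB = torch.randn(8, 16, dtype=torch.float32)   # LoRA B weight left in fp32

# This line raises "expected mat1 and mat2 to have the same dtype",
# which looks like the error above:
# temp = dY @ downB.t()

# Casting one operand (here the fp32 weight down to bf16) makes it run:
temp = dY @ downB.t().to(dY.dtype)
print(temp.dtype)  # torch.bfloat16
```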
Sorry, typo: I meant "dY is a bfloat16" (from the original error message).
So @danielhanchen, making sure I understand this correctly:
- training code works with PEFT LoRA because it downcasts everything to bfloat16
- in Unsloth PEFT we override LoraLayer.update_layer to not...
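For concreteness, here's roughly what I mean by "downcasts everything to bfloat16", applied by hand to just the adapter weights after get_peft_model. This is only a sketch to make sure we're talking about the same thing, not necessarily the right fix (PEFT keeping these in fp32 may be deliberate for stability):

```python
import torch

# Sketch: cast only the LoRA adapter parameters to bfloat16 so they match
# the bf16 activations/gradients flowing through the base model.
for name, param in model.named_parameters():
    if "lora_" in name and param.dtype == torch.float32:
        param.data = param.data.to(torch.bfloat16)
```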
I'm currently getting this error with peft==0.10.0 and installing unsloth from source (git clone, pip install -e .). Here's the stacktrace:

```
==((====))==  Unsloth: Fast Mistral patching release 2024.4
\\...
```