Results: 7 comments by Minghao Yan

I disabled all occurrences of bf16. If you need to use bf16, then I am not sure this workaround will help.
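
Roughly what I mean by "disabling bf16", as a minimal toy sketch (the model and shapes here are placeholders, not the actual setup; the real change was in the training configuration):

```python
import torch
import torch.nn as nn

# Placeholder model standing in for the real one; the point is only to show
# keeping everything in float32 instead of casting to bfloat16.
model = nn.Linear(16, 16)

# Instead of model.to(torch.bfloat16) or an autocast(dtype=torch.bfloat16)
# region, keep parameters and activations in float32.
model = model.to(torch.float32)

x = torch.randn(4, 16)
out = model(x)          # no bf16 autocast wrapped around the forward pass
print(out.dtype)        # torch.float32
```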

Thanks for your reply! I think I might not be loading the model correctly. I disabled the LoRA implementation entirely and reverted to the default llama-3-8b setup. Currently I am trying to copy...

Thank you very much for the pointer! After some more investigation, it does seem like the first step loss is too high (without any LoRA or any training) after loading...
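
For context, this is the kind of check I mean by "first step loss is too high", as a toy sketch (the model, shapes, and checkpoint path are placeholders, not the real llama-3-8b setup):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical tiny LM head and fake batch, just to show measuring the loss
# of the loaded weights before any optimizer step is taken.
vocab_size, hidden = 128, 32
model = nn.Linear(hidden, vocab_size)

# In the real run the weights come from the checkpoint; the path below is
# a placeholder and is left commented out here.
# model.load_state_dict(torch.load("checkpoint.pt", map_location="cpu"))

model.eval()
with torch.no_grad():
    hidden_states = torch.randn(4, 16, hidden)       # fake activations
    labels = torch.randint(0, vocab_size, (4, 16))    # fake targets
    logits = model(hidden_states)
    loss = F.cross_entropy(logits.view(-1, vocab_size), labels.view(-1))

# A randomly initialized (or incorrectly loaded) model sits near ln(vocab_size);
# a correctly loaded pretrained model should be well below that.
print(loss.item(), torch.log(torch.tensor(float(vocab_size))).item())
```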

Thank you for your reply! I moved load_from_full_model_state_dict to after model.to_empty(...). If I keep the torch device as cpu, the behavior is the same; if I change the device to cuda, it...
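
For clarity, this is the ordering I mean, as a self-contained sketch: materialize the meta-device model with to_empty(...) first, then load the real weights. I substitute plain load_state_dict for the load_from_full_model_state_dict helper so the snippet runs on its own; the model and sizes are placeholders.

```python
import torch
import torch.nn as nn

def build_model() -> nn.Module:
    # Placeholder architecture standing in for the real model.
    return nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8))

# A "full" state dict that would normally come from the checkpoint on disk.
full_state_dict = build_model().state_dict()

with torch.device("meta"):
    model = build_model()            # parameters exist only as metadata

# to_empty(...) allocates uninitialized storage on the target device...
model = model.to_empty(device="cpu")  # or device="cuda" on a GPU machine

# ...and only after that are the real weights copied in (in the actual run,
# this is where load_from_full_model_state_dict is called).
model.load_state_dict(full_state_dict)

print(next(model.parameters()).device)
```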

Thank you! I have created a PR here: #427

I was not aware of this, thank you!