Benjamin Bossan
> I think your understanding of the gradient approximation is right. Since LoRA-One needs to use the first-step gradients from full fine-tuning, we need the efficient approach from LoRA-GA to...
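For reference, here is a toy sketch (not the actual LoRA-GA/PEFT code; the model, data, and helper name are made up) of the quantity being approximated, i.e. the first-step full fine-tuning gradient of a frozen linear layer, obtained without allocating any optimizer state for the full weight. The efficient LoRA-GA approach collects this for all target layers in a single backward pass; this toy version does one layer at a time just to show the idea:

```python
import torch
import torch.nn as nn

def first_step_grad(layer: nn.Linear, model: nn.Module, inputs, targets, loss_fn):
    # Temporarily let autograd track this single weight; everything else stays frozen.
    layer.weight.requires_grad_(True)
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    grad = layer.weight.grad.detach().clone()  # dL/dW, shape (out_features, in_features)
    layer.weight.grad = None
    layer.weight.requires_grad_(False)
    return grad

# Toy usage on a fully frozen two-layer model.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
model.requires_grad_(False)
x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
g = first_step_grad(model[0], model, x, y, nn.CrossEntropyLoss())
print(g.shape)  # torch.Size([32, 16])
```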
Thanks for proposing this fix @Aznix07. However, applying this broadly requires a lot more changes. I have worked on those in #2893. I think this PR can be closed....
Just to be sure, this will be part of transformers v5?
Thanks for this detailed report. Debugging this type of issue can be really difficult, props for trying out a bunch of different things. At first glance, I can't spot...
> This script seems to work (and wow it is much better than mine haha). I use LLama3-8B and everything can be saved locally, which is where my script fails,...
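In case it helps to compare against the failing step, here is a minimal sketch of the adapter save/load round trip I have in mind (model name and paths are just placeholders; with FSDP, the state dict also needs to be gathered to rank 0 first, which the Trainer's `save_model` normally takes care of):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, PeftModel, get_peft_model

base_id = "meta-llama/Meta-Llama-3-8B"  # placeholder base model
base = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

model = get_peft_model(base, LoraConfig(task_type="CAUSAL_LM"))
# ... training ...

# Only the adapter weights (a few MB) are written, not the full base model.
model.save_pretrained("./llama3-lora-adapter")
tokenizer.save_pretrained("./llama3-lora-adapter")

# Later: reload the base model and attach the saved adapter.
base = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, "./llama3-lora-adapter")
```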
> I used the [bits and bytes guide](https://huggingface.co/docs/bitsandbytes/main/en/fsdp_qlora#training) which actually uses the [PEFT example repo](https://github.com/huggingface/peft/tree/main/examples/sft). > It seems that both guides work as they reference the same example in the...
I tried to reproduce but still have very little experience with DeepSpeed, so I may be doing something wrong. When I try to start the script with `accelerate launch`, I...
> In the end, I solved the issue using DeepSpeed + QLoRA for [example](https://github.com/huggingface/peft/tree/main/examples/sft). > > And I tried actions such as changing the versions of `PEFT` and `Accelerate`, but...
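For anyone landing here later, the model-loading part of such a QLoRA setup would look roughly like the sketch below (model name and hyperparameters are placeholders, not the exact values from the sft example; launching with DeepSpeed or FSDP is then handled by `accelerate` and its config file):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_storage=torch.bfloat16,  # relevant when sharding the quantized weights
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",  # placeholder
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
```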
Could you show us how you launch the script? Also, from the last nvidia-smi output you posted, memory usage is 13532MiB and 12328MiB. This looks rather fine to me, I...
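As a debugging suggestion (not part of your reproduction), it can also help to log what PyTorch itself has allocated on each device, since nvidia-smi additionally counts the CUDA context and cache fragmentation:

```python
import torch

def log_gpu_memory(tag: str = "") -> None:
    # Print allocated/reserved/peak memory for the current CUDA device in MiB.
    if not torch.cuda.is_available():
        return
    dev = torch.cuda.current_device()
    alloc = torch.cuda.memory_allocated(dev) / 2**20
    reserved = torch.cuda.memory_reserved(dev) / 2**20
    peak = torch.cuda.max_memory_allocated(dev) / 2**20
    print(f"[{tag}] device={dev} allocated={alloc:.0f}MiB "
          f"reserved={reserved:.0f}MiB peak={peak:.0f}MiB")
```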
> My training script is provided in the reproduction section above. Yes, I mean how do you launch the training script exactly? > 3. While training with deepspeed did not...