CHNRyan
Same error! When I fine-tune Llama-2 with ZeRO-3 and QLoRA, the parameters are fully loaded onto each GPU and only then partitioned. I think they should be partitioned first and...
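A quick way to check what each rank actually materialized right after `from_pretrained` (a hedged sketch; assumes a multi-GPU launch, and `report_local_params` is just an illustrative helper, not from this thread):

```python
# Illustrative helper: under ZeRO-3, partitioned parameters are empty
# placeholders on each rank and carry a `ds_numel` attribute holding the
# full size, so `local` should be ~0 right after loading.
# If `local` equals `full`, the weights were fully loaded on this GPU first.
import torch

def report_local_params(model: torch.nn.Module) -> None:
    local = sum(p.numel() for p in model.parameters())
    full = sum(getattr(p, "ds_numel", p.numel()) for p in model.parameters())
    print(f"locally materialized elements: {local:,} / full model: {full:,}")
```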
> Maybe this link will help, https://huggingface.co/docs/transformers/main/en/deepspeed?models=pretrained+model#non-trainer-deepspeed-integration

@Taiinguyenn139 Thanks for your reply! I have tried it, but it still fails. Here is my code; maybe it is not correct:

```
import...
```
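In case it helps others, here is a minimal sketch of the non-Trainer pattern that page describes, using plain (non-quantized) LoRA; the model name, LoRA targets, and hyperparameters are placeholders, and whether the 4-bit QLoRA path partitions the same way is exactly what's in question in this thread:

```python
# Minimal sketch of the non-Trainer DeepSpeed integration; launch with e.g.
# `deepspeed --num_gpus=2 train.py`. All names and values are placeholders.
import os
import deepspeed
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM
from transformers.integrations import HfDeepSpeedConfig

world_size = int(os.getenv("WORLD_SIZE", "1"))
ds_config = {
    "zero_optimization": {"stage": 3},
    "optimizer": {"type": "AdamW", "params": {"lr": 2e-4}},
    "train_micro_batch_size_per_gpu": 1,
    "train_batch_size": world_size,  # micro_batch * grad_accum * world_size
}

# Must be created and kept alive BEFORE from_pretrained, so transformers
# wraps model construction in deepspeed.zero.Init (partition at load time).
dschf = HfDeepSpeedConfig(ds_config)

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Recent DeepSpeed versions accept a dict for `config`.
engine, _, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=[p for p in model.parameters() if p.requires_grad],
    config=ds_config,
)
# Training loop: loss = engine(**batch).loss; engine.backward(loss); engine.step()
```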
> (1) In my experience, you can run ZeRO-3 with SFTTrainer or Trainer. (2) I don't use accelerate, but I use the deepspeed command like this:
>
> ```
> ...
> ```
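For readers following along, a minimal sketch of that Trainer + deepspeed-launcher path (the script layout, model name, toy dataset, and `ds_config.json` are my placeholders, not the elided command above; it assumes a ZeRO-3 JSON config exists on disk):

```python
# Minimal sketch: ZeRO-3 through the HF Trainer, launched with the deepspeed
# command, e.g.:
#   deepspeed --num_gpus=2 train_zero3.py
# All names and values below are placeholders.
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

class ToyDataset(Dataset):
    """A few repeated samples, just to keep the script self-contained."""
    def __init__(self, tokenizer):
        enc = tokenizer("Hello world.", return_tensors="pt")
        self.item = {
            "input_ids": enc["input_ids"][0],
            "attention_mask": enc["attention_mask"][0],
            "labels": enc["input_ids"][0].clone(),
        }
    def __len__(self):
        return 32
    def __getitem__(self, idx):
        return self.item

def main():
    name = "meta-llama/Llama-2-7b-hf"  # placeholder
    tokenizer = AutoTokenizer.from_pretrained(name)

    # NOTE: TrainingArguments must be created BEFORE from_pretrained so the
    # ZeRO-3 config is visible when the model loads; otherwise the weights
    # are fully materialized on each GPU first.
    args = TrainingArguments(
        output_dir="out",
        per_device_train_batch_size=1,
        num_train_epochs=1,
        deepspeed="ds_config.json",  # path to a ZeRO stage-3 config file
    )
    model = AutoModelForCausalLM.from_pretrained(name)

    Trainer(model=model, args=args, train_dataset=ToyDataset(tokenizer)).train()

if __name__ == "__main__":
    main()
```

The ordering detail in the comments is the part that bites most often: with stage 3, instantiate `TrainingArguments` first, then load the model, so partitioning happens at load time.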