LoRA support
Thanks for your awesome work.
Does it support LoRA? If not, could you please give some basic suggestions, since I don't have an 80GB GPU?
Currently EasyR1 does not support LoRA. The official repo says to use `worker.actor.fsdp.torch_dtype=bf16` and `worker.actor.optim.strategy=adamw_bf16` to enable bf16 training. If you still do not have enough memory, ModelScope's Swift framework supports LoRA fine-tuning. We plan to implement a LoRA version with ModelScope soon.
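For reference, here is a minimal sketch of how those two overrides might be passed on the command line. The entrypoint and config path below are assumptions based on EasyR1-style launchers, so substitute your actual script and config:

```bash
# Sketch: enable bf16 training to reduce memory.
# The launcher module and config path are assumed; only the two
# worker.actor.* overrides come from the maintainer's reply above.
python3 -m verl.trainer.main \
    config=examples/config.yaml \
    worker.actor.fsdp.torch_dtype=bf16 \
    worker.actor.optim.strategy=adamw_bf16
```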
Thank you for your prompt reply!
I'm also curious: how much training time does Vision-SR1 require for the cold-start (SFT) stage and the subsequent GRPO (RL) stage, respectively?
About 3 hours for SFT. For RL we cut off at a chosen stopping point, since it would probably take weeks to finish a full pass over the 47K examples.
Thanks for your reply!
Could you please give me some suggestions on how to cut off training? Should I just use Ctrl+C? And how should I choose the stopping point?
Also, if the 47K examples are not fully utilized because an early cutoff already achieves similar performance, why collect 47K in the first place?
Thank you very much!
Yes, just use Ctrl+C to kill the training process. To choose a stopping point, check convergence on the validation dataset: when your validation performance has converged, that is a good point to cut off.
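As a rough illustration (not part of the repo), one way to eyeball convergence is to tail the validation metric from the training log. The log path and metric pattern below are assumptions, so adjust them to whatever your run actually writes:

```bash
# Hypothetical sketch: print the last few validation scores from the log.
# Assumes the trainer logs lines containing "val/" (e.g. "val/score: 0.63");
# change the pattern and log path to match your setup.
grep "val/" logs/train.log | tail -n 10
```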
Since we train with random shuffling, we did not know how much data we would need in the first place. There is a recently released high-quality dataset, FineVision, which we plan to integrate into the training.
Thanks!
However, when I ran the script `1-b_visionR1_train.sh` and stopped training with Ctrl+C, I noticed that no checkpoints were saved in the `saves` directory. Could this be because there is a saving interval? I couldn't find any hyperparameter in `1-b_visionR1_train.sh` to configure such an interval, and the number of epochs is set to 1 by default.
Refer to some of the training config files: search for `save_freq` to modify the saving frequency. Currently it saves every 15 steps.
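As a hedged sketch, you can locate the setting and override it on the command line. The key path `trainer.save_freq`, the `configs/` directory, and the launch command below are assumptions based on EasyR1-style configs, so confirm them by searching the repo:

```bash
# Sketch: find where the checkpoint interval is defined, then override it.
# The key path trainer.save_freq and paths below are assumed, not confirmed.
grep -rn "save_freq" configs/            # find the interval in the YAML files
python3 -m verl.trainer.main \
    config=examples/config.yaml \
    trainer.save_freq=5                  # e.g. save every 5 steps instead of 15
```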