LoRA support
Thanks for your awesome work.
Does it support LoRA? If not, could you please give some basic suggestions, since I don't have an 80GB GPU?
Currently EasyR1 does not support LoRA. The official repo says to use `worker.actor.fsdp.torch_dtype=bf16` and `worker.actor.optim.strategy=adamw_bf16` to enable bf16 training. If you still do not have enough memory, ModelScope's Swift framework supports LoRA fine-tuning. We plan to implement a LoRA version with ModelScope soon.
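For reference, here is a minimal sketch of how those two overrides might be passed on the command line. The entrypoint and config path below are assumptions based on EasyR1-style launchers, so substitute your actual script and config:

```bash
# Sketch: enable bf16 training to reduce memory.
# The launcher module and config path are assumed; only the two
# worker.actor.* overrides come from the maintainer's reply above.
python3 -m verl.trainer.main \
    config=examples/config.yaml \
    worker.actor.fsdp.torch_dtype=bf16 \
    worker.actor.optim.strategy=adamw_bf16
```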
Thank you for your prompt reply!
I'm also curious: how much training time does Vision-SR1 require for the cold-start (SFT) stage and the subsequent GRPO (RL) stage, respectively?
About 3 hours for SFT. For RL we cut off at a chosen stopping point, since it would probably take weeks to finish a full pass over the 47K examples.
Thanks for your reply!
Could you please give me some suggestions on how to cut off training? Should I just use Ctrl+C? And how should I choose the stopping point?
Also, if the 47K examples are not fully utilized because an early cutoff already achieves similar performance, why collect 47K in the first place?
Thank you very much!
Yes, just use Ctrl+C to kill the training process. To choose a stopping point, check convergence on the validation dataset: when your validation performance has converged, that is a good point to cut off.
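As a rough illustration (not part of the repo), one way to eyeball convergence is to tail the validation metric from the training log. The log path and metric pattern below are assumptions, so adjust them to whatever your run actually writes:

```bash
# Hypothetical sketch: print the last few validation scores from the log.
# Assumes the trainer logs lines containing "val/" (e.g. "val/score: 0.63");
# change the pattern and log path to match your setup.
grep "val/" logs/train.log | tail -n 10
```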
Since we train with random shuffling, we did not know how much data we would need in the first place. There is a recently released high-quality dataset, FineVision, which we plan to integrate into the training.
Thanks!
However, when I ran the script `1-b_visionR1_train.sh` and stopped training with Ctrl+C, I noticed that no checkpoints were saved in the `saves` directory. Could this be because there is a saving interval? I couldn't find any hyperparameter in `1-b_visionR1_train.sh` to configure such an interval, and the number of epochs is set to 1 by default.
Refer to some of the training config files: search for `save_freq` to modify the saving frequency. Currently it saves every 15 steps.
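As a hedged sketch, you can locate the setting and override it on the command line. The key path `trainer.save_freq`, the `configs/` directory, and the launch command below are assumptions based on EasyR1-style configs, so confirm them by searching the repo:

```bash
# Sketch: find where the checkpoint interval is defined, then override it.
# The key path trainer.save_freq and paths below are assumed, not confirmed.
grep -rn "save_freq" configs/            # find the interval in the YAML files
python3 -m verl.trainer.main \
    config=examples/config.yaml \
    trainer.save_freq=5                  # e.g. save every 5 steps instead of 15
```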