InternVL
[Feature] How to finetune larger models with DeepSpeed?
Motivation
Thanks for your great work on InternVL2.5! For larger models (38B or above), the finetuning scripts are implemented with srun + partition. Is it possible to run them with DeepSpeed instead? Thanks!
Hi, you can use the same scripts as for the smaller models, but enable DeepSpeed ZeRO-2 or ZeRO-3 to reduce per-GPU memory usage.
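For reference, a minimal ZeRO-3 DeepSpeed config might look like the sketch below. This is an illustrative example, not the exact config shipped with InternVL; the `"auto"` values assume the script is built on the Hugging Face Trainer, which fills them in from its own arguments, and the CPU-offload entries are optional (they trade speed for memory and can be removed if GPU memory suffices).

```json
{
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "offload_param": { "device": "cpu", "pin_memory": true },
    "overlap_comm": true,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
```

Assuming the training script accepts a `--deepspeed` flag (as Trainer-based scripts typically do), you would pass this file to it, e.g. `torchrun --nproc_per_node=8 finetune.py --deepspeed zero_stage3_config.json ...`, where the script and file names here are placeholders for your actual ones. For ZeRO-2, set `"stage": 2` and drop `offload_param` (parameter offload/partitioning is a ZeRO-3 feature).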
Thanks a lot!