InternVL
[Feature] How to finetune larger models with DeepSpeed?
Motivation
Thanks for your great work on InternVL2.5! For larger models (38B or above), the finetuning scripts are implemented with srun + partition. Is it possible to run them with DeepSpeed instead? Thanks!
Hi, you can use the same scripts as for the smaller models, but enable DeepSpeed ZeRO-2 or ZeRO-3 to reduce per-GPU memory usage.
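For reference, a minimal ZeRO-3 DeepSpeed config might look like the sketch below. This is an illustrative example, not the exact config shipped with InternVL; the `"auto"` values assume the script is built on the Hugging Face Trainer, which fills them in from its own arguments, and the CPU-offload entries are optional (they trade speed for memory and can be removed if GPU memory suffices).

```json
{
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "offload_param": { "device": "cpu", "pin_memory": true },
    "overlap_comm": true,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
```

Assuming the training script accepts a `--deepspeed` flag (as Trainer-based scripts typically do), you would pass this file to it, e.g. `torchrun --nproc_per_node=8 finetune.py --deepspeed zero_stage3_config.json ...`, where the script and file names here are placeholders for your actual ones. For ZeRO-2, set `"stage": 2` and drop `offload_param` (parameter offload/partitioning is a ZeRO-3 feature).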
Thanks a lot!