Neeraj Singh Aithani

4 comments by Neeraj Singh Aithani

I am using 2x A5000 GPUs. I was able to train the T5-XL model using tensor parallelism.
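For anyone unfamiliar with the idea, here is a toy sketch of what tensor parallelism does to a single linear layer. It is plain Python with no GPUs involved, and it is not the code used for the T5 run above. The helper names (`matmul`, `split_columns`, `column_parallel_matmul`) are made up for illustration. Each column shard of the weight matrix stands in for the slice one GPU would hold.

```python
def matmul(x, w):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def split_columns(w, parts):
    """Split a weight matrix column-wise into `parts` equal shards."""
    cols = len(w[0])
    assert cols % parts == 0, "column count must divide evenly across shards"
    step = cols // parts
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(parts)]


def column_parallel_matmul(x, w, parts=2):
    """Compute x @ w by giving each (hypothetical) device one column shard of w."""
    shards = split_columns(w, parts)
    # Each partial product would run on its own GPU, holding only its shard.
    partials = [matmul(x, shard) for shard in shards]
    # Concatenating the partial outputs column-wise plays the role of an all-gather.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1, 2], [3, 4]]              # activations, replicated on every device
w = [[1, 0, 2, 1], [0, 1, 1, 3]]  # full weight matrix (2 x 4)
full = matmul(x, w)
sharded = column_parallel_matmul(x, w, parts=2)
print(full == sharded)
```

The point is that each device only ever stores half of `w`, yet the concatenated result matches the unsharded product.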

DeepSpeed supports model parallelism (MP) to fit large models that would otherwise not fit in GPU memory.
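A rough back-of-the-envelope sketch of why splitting helps. This is not a DeepSpeed API. The function name and every number are assumptions for illustration: roughly 3 billion parameters, and 16 bytes per parameter for training state (fp32 weights, fp32 gradients, and two Adam moments). It ignores activations and assumes the state splits evenly across GPUs, so treat the result as an optimistic lower bound.

```python
def model_state_gib(num_params, bytes_per_param=16, mp_degree=1):
    """Estimate per-GPU memory (GiB) for weights, gradients, and optimizer state.

    Assumes the state is divided evenly across `mp_degree` GPUs and
    ignores activations, buffers, and fragmentation.
    """
    return num_params * bytes_per_param / mp_degree / 2**30


NUM_PARAMS = 3_000_000_000  # assumed model size, for illustration only

single = model_state_gib(NUM_PARAMS)               # everything on one GPU
split = model_state_gib(NUM_PARAMS, mp_degree=2)   # sharded over two GPUs

print(f"1 GPU : {single:.1f} GiB of model state")
print(f"2 GPUs: {split:.1f} GiB of model state per GPU")
```

Under these assumptions the per-GPU footprint halves with two GPUs, which is the basic reason a model that overflows one card can fit once it is partitioned.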

Hi @wookjeHan, did you figure out how to do pipeline parallelism with large Hugging Face models?