gdGPT
Any plans to incorporate tensor parallelism or ZeRO data parallelism?
Would it be possible in this framework to combine the pipeline parallelism with tensor parallelism or ZeRO data parallelism?
Hi,
Thanks for your interest in this repo!
Are there any experiments or posts showing that adding tensor parallelism or ZeRO would improve training performance?
Not really.
However, there are projects, such as Megatron, that combine pipeline parallelism with tensor parallelism for efficiency. I believe this project offers a better solution, since it depends only on DeepSpeed and avoids the heavy dependencies that Megatron has.
As for pipeline parallelism with ZeRO, I have not seen any other project do this.
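For readers unfamiliar with how pipeline, tensor, and data parallelism fit together on one cluster, here is a minimal sketch of how a fixed number of GPUs can be factored into the three dimensions. The function name `build_rank_grid` and the rank ordering are assumptions made for illustration; they are not taken from gdGPT, DeepSpeed, or Megatron, which may lay out ranks differently.

```python
from itertools import product


def build_rank_grid(world_size: int, pipe: int, tensor: int) -> dict:
    """Map each global rank to (pipe_stage, data_replica, tensor_shard).

    Hypothetical helper: the data-parallel degree is whatever remains
    after the pipeline and tensor degrees are fixed.
    """
    if world_size % (pipe * tensor) != 0:
        raise ValueError("world_size must be divisible by pipe * tensor")
    data = world_size // (pipe * tensor)
    # Assumed ordering: tensor shards vary fastest, pipeline stages slowest.
    coords = product(range(pipe), range(data), range(tensor))
    return {rank: coord for rank, coord in enumerate(coords)}


grid = build_rank_grid(world_size=8, pipe=2, tensor=2)
for rank, (stage, replica, shard) in grid.items():
    print(f"rank {rank}: pipe stage {stage}, data replica {replica}, tensor shard {shard}")
```

With 8 GPUs, 2 pipeline stages, and 2 tensor shards, the remaining factor of 2 becomes the data-parallel degree, which is the dimension ZeRO would partition optimizer state across.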