
Does tensor_parallel support multi-node tensor parallel training?

Open liguodongiot opened this issue 1 year ago • 5 comments

liguodongiot avatar Jun 07 '23 09:06 liguodongiot

I want to know too.

zhangjunyi111 avatar Aug 16 '23 06:08 zhangjunyi111

@BlackSamorez Hope you can answer this question 😄😄

longday1102 avatar Aug 27 '23 18:08 longday1102

@BlackSamorez I have 2 servers with a total of 16 GPUs, so I would love to be able to use multi-node tensor parallelism to train a large language model, for example BLOOM 176B. I hope you can explain how to use multi-node tensor parallelism. Thank you very much.

longday1102 avatar Aug 27 '23 18:08 longday1102

Is this solved? If so, how?

PieterZanders avatar May 10 '24 19:05 PieterZanders

Same question.

deema-A avatar Aug 14 '24 03:08 deema-A

Haha, everybody has the same problem. I think there is no feature like this, but we absolutely need it. Recently I tried DeepSpeed, which is developed by Microsoft. Maybe it has this feature, but Microsoft's code doesn't support Windows 😄

Tezcan98 avatar Oct 08 '24 14:10 Tezcan98