
[QUESTION]

Open suzewei opened this issue 6 months ago • 0 comments

Hi, I am training a Llama2-7B model with Megatron-LM on four H20 nodes (32 GPUs in total), with the parallel strategy TP=8, PP=2, DP=2. I would like to know how much data is communicated within each parallel group (TP, PP, DP). Is there a parameter or setting that reports these values? If not, how can I obtain them from the code? Thank you.

suzewei · Aug 06 '24 11:08
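Megatron-LM does not expose a flag that reports per-group communication volume directly, but a rough upper bound can be estimated analytically. The sketch below is a hypothetical helper (the function name and the formula's assumptions are mine, not from Megatron's code): it assumes Megatron's standard tensor parallelism, which performs two activation all-reduces per transformer layer in the forward pass (after attention and after the MLP) and two in the backward pass, and that NCCL uses a ring all-reduce, in which each rank sends roughly `2*(n-1)/n` times the tensor size.

```python
def tp_allreduce_bytes_per_layer(micro_batch, seq_len, hidden, tp_size,
                                 bytes_per_elem=2):
    """Estimate the bytes each rank sends for TP all-reduces in one
    transformer layer (forward + backward), per micro-batch.

    Assumptions (hypothetical sketch, not Megatron's own accounting):
    - 2 all-reduces in forward, 2 in backward, each over an activation
      tensor of shape [micro_batch, seq_len, hidden].
    - Ring all-reduce: each rank transmits 2*(n-1)/n of the tensor.
    - bytes_per_elem=2 corresponds to fp16/bf16 activations.
    """
    if tp_size <= 1:
        return 0.0  # no tensor-parallel communication
    tensor_bytes = micro_batch * seq_len * hidden * bytes_per_elem
    per_allreduce = 2 * (tp_size - 1) / tp_size * tensor_bytes
    return 4 * per_allreduce  # 2 forward + 2 backward all-reduces


# Example with Llama2-7B-like dimensions (hidden=4096), seq_len=4096,
# micro-batch 1, TP=8, as in the question's setup:
print(tp_allreduce_bytes_per_layer(1, 4096, 4096, 8) / 2**20, "MiB")
```

For measured rather than estimated numbers, two common options are running with `NCCL_DEBUG=INFO` (and NCCL's topology/collective logging) to see the collectives issued, or wrapping the calls in `torch.distributed` (e.g. `all_reduce`, `all_gather`, `send`/`recv`) to accumulate `tensor.numel() * tensor.element_size()` per process group; Megatron's parallel groups are created in `megatron/core/parallel_state.py`, so the group handles there can be matched against the group argument of each call.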