apex
[DDP][Master Weight] For DDP + Master weight, is it necessary to set torch seed for training?
Hi,
When using APEX for DDP model training, is it necessary to set the seed at the beginning to make sure the master weights have the same values across all ranks? PyTorch DDP only performs a collective broadcast on the model's weights to keep the model identical on all ranks; for the master weights, how does APEX keep them identical across ranks?
Thank you.
Hello, could someone help give an explanation?
why would you want to use apex's DDP at this moment? It's going to be removed in a few months
Thank you for your attention. I don't use APEX DDP; I am using torch native DDP together with APEX for model training with a low-precision datatype. I want to know whether it is necessary to set the torch seed at the beginning when running DDP with APEX master weights. The master weights are used for mixed-precision training.
So, how does APEX make sure the master weights are the same on all ranks? Thank you.
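For what it's worth, one way to reason about this: APEX creates the FP32 master weights by copying the model's parameters at `amp.initialize` time, so if the model parameters are identical on every rank at that point, the master copies are too. Seeding before model construction is one way to guarantee that. Below is a minimal sketch (the `build_model` helper and the seed value are illustrative, not from APEX or PyTorch) that simulates two "ranks" in a single process to show that the same seed yields identical initial weights:

```python
import torch
import torch.nn as nn

def build_model(seed):
    # Seeding before construction makes the random initialization
    # deterministic, so every rank that uses the same seed builds
    # identical weights -- and any FP32 master copies made from them
    # (e.g. by amp.initialize) start out identical as well.
    torch.manual_seed(seed)
    return nn.Linear(4, 2)

# Simulate two ranks building their local model with the same seed:
rank0_model = build_model(1234)
rank1_model = build_model(1234)

identical = all(
    torch.equal(p0, p1)
    for p0, p1 in zip(rank0_model.parameters(), rank1_model.parameters())
)
print(identical)
```

Alternatively, if you rely on DDP's broadcast of the model parameters at construction, wrapping with DDP before the master copies are made would serve the same purpose; the key invariant is that the parameters match across ranks at the moment the master weights are created.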