Stochastic Depth Determinism
There are a few problems with StochasticDepth determinism right now:
- use_same_gpu_seed assumes each process has exactly the same seed, when instead each process has seed = user provided seed + global_rank.
- The generator object created in apply_stochastic_depth needs to be seeded, otherwise the use_same_depth_across_gpus functionality is broken because the generator for each process is initialized with a random seed.
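
The two problems above can be illustrated with a small simulation. This is only a sketch, not Composer's implementation: Python's `random.Random` stands in for the generator created in apply_stochastic_depth, and the helpers `per_rank_seed`, `drop_decisions`, and `simulate` are hypothetical names introduced for illustration. It assumes that sharing one explicit seed across ranks is the intended behavior.

```python
import random
from typing import List


def per_rank_seed(user_seed: int, global_rank: int) -> int:
    # As described in the issue: each process ends up with
    # seed = user provided seed + global_rank.
    return user_seed + global_rank


def drop_decisions(seed: int, num_layers: int, drop_rate: float) -> List[bool]:
    # random.Random is a stand-in for the per-process generator.
    # Seeding it explicitly makes the sequence of decisions reproducible.
    gen = random.Random(seed)
    return [gen.random() < drop_rate for _ in range(num_layers)]


def simulate(user_seed: int, world_size: int, num_layers: int = 64,
             drop_rate: float = 0.5, share_seed: bool = True) -> List[List[bool]]:
    # Returns one list of drop decisions per simulated rank.
    decisions = []
    for rank in range(world_size):
        if share_seed:
            seed = user_seed  # every rank seeds its generator identically
        else:
            seed = per_rank_seed(user_seed, rank)  # rank-offset seed
        decisions.append(drop_decisions(seed, num_layers, drop_rate))
    return decisions


if __name__ == "__main__":
    shared = simulate(42, world_size=4, share_seed=True)
    offset = simulate(42, world_size=4, share_seed=False)
    print("shared seed, all ranks agree:", all(d == shared[0] for d in shared))
    print("rank-offset seed, all ranks agree:", all(d == offset[0] for d in offset))
```

With a shared, explicit seed every simulated rank drops the same layers; with rank-offset seeds (or an unseeded generator) the ranks diverge, which is the failure mode the list above describes.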
**Environment**
Any multi-GPU environment

**To reproduce**
Train a model with StochasticDepth and the seed set on multiple GPUs.

**Expected behavior**
use_same_gpu_seed and use_same_depth_across_gpus should work deterministically.