Selvaraj Anandaraj
This PR adds an environment variable that enables logging of peak memory usage at the end of each training step. This environment variable, "NEMO_LOG_MEMORY_USAGE", should be...
# What does this PR do? This PR adds an interface argument to support the MCore Partial DistOpt feature. The argument `partial_data_parallel_shard_factor` determines the level of sharding that...
# Description Added double buffering when reloading activations from CPU to GPU. This helps reduce memory fragmentation when using CPU offloading close to GPU peak memory....
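
To illustrate the double-buffering idea from the last description, here is a minimal pure-Python sketch. It is not the code from the PR: the class name `DoubleBuffer`, the method `reload`, and the use of `bytearray` as a stand-in for GPU memory are all assumptions made for illustration. The point it shows is that two buffers are allocated once and reused in alternation, so repeated reloads do not trigger fresh allocations that could fragment memory.

```python
class DoubleBuffer:
    """Hypothetical sketch: two fixed-size staging buffers used in alternation."""

    def __init__(self, nbytes: int) -> None:
        # Allocate both buffers once, up front, so later reloads never allocate.
        self._buffers = [bytearray(nbytes), bytearray(nbytes)]
        self._next = 0  # index of the buffer the next reload will write into

    def reload(self, cpu_copy: bytes) -> memoryview:
        """Copy an offloaded activation into the next buffer and return a view of it."""
        n = len(cpu_copy)
        if n > len(self._buffers[self._next]):
            raise ValueError("activation is larger than the preallocated buffer")
        view = memoryview(self._buffers[self._next])[:n]
        view[:] = cpu_copy  # in-place copy into existing storage
        self._next ^= 1     # alternate between buffer 0 and buffer 1
        return view


# Usage: the first and third reloads land in the same underlying buffer.
db = DoubleBuffer(4)
first = db.reload(b"aaaa")
second = db.reload(b"bbbb")
third = db.reload(b"cc")
```

With only two buffers, a reloaded activation stays valid until the reload after next overwrites it, so this pattern assumes each activation is consumed before its buffer comes around again.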