DeepSpeed
Pipeline: Add support for eval micro batch size configuration
During evaluation, overall memory consumption is lower, mainly due to the absence of gradients and of forward activations held for the backward pass. This makes it possible to increase the micro batch size and improve evaluation performance. This commit adds the option to pass num_micro_batches to eval_batch(), since the current assumption is that evaluation uses the same micro batch size and global batch size as training, and therefore runs the same number of micro batches. This commit also modifies _scale_loss_by_gas in runtime/engine.py to scale the loss by the number of eval micro batches instead of the training gradient accumulation steps (GAS).
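A minimal usage sketch of the proposed option, under assumptions: the engine setup, the iterator names, and the micro batch count of 8 are illustrative placeholders; only the num_micro_batches argument comes from this change.

```python
import deepspeed

# Sketch only: model, train_iter, eval_iter and ds_config are placeholders.
engine, _, _, _ = deepspeed.initialize(model=model, config=ds_config)

# Training: the number of micro batches per step is fixed by the config
# (gradient accumulation steps derived from train_batch_size and
# train_micro_batch_size_per_gpu).
train_loss = engine.train_batch(data_iter=train_iter)

# Evaluation: with no gradients or retained forward activations, a larger
# micro batch fits in memory, so fewer micro batches cover the same global
# batch. The proposed argument overrides the micro batch count, and the
# modified _scale_loss_by_gas divides the accumulated loss by this count
# rather than by the training gradient accumulation steps.
eval_loss = engine.eval_batch(data_iter=eval_iter, num_micro_batches=8)
```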