
DeepSpeedCheckpoint: support custom final ln idx

Open nelyahu opened this issue 9 months ago • 0 comments

Until now, only the last layer (idx=-1) was treated as the final layer norm, via FINAL_LAYER_NORM_INDEX, which is set to -1. This PR allows the user to pass a custom value for models where this default does not apply. See an example of usage in the HabanaAI/Megatron-DeepSpeed fork repository: https://github.com/HabanaAI/Megatron-DeepSpeed/blob/c9feb8cacabc6dd4da4266cff08db555a21122e2/tools/verify_checkpoint_non_tp_consistency.py#L296
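
To illustrate the idea, here is a minimal, self-contained sketch of selecting the final layer norm by a configurable index. It is not DeepSpeed's actual code: the helper `select_final_layer_norm`, its `final_layer_norm_idx` parameter, and the layer names are hypothetical. Only the constant name FINAL_LAYER_NORM_INDEX and its default of -1 come from the description above.

```python
# Illustrative sketch only; not DeepSpeed's actual implementation.
FINAL_LAYER_NORM_INDEX = -1  # default named in the PR description


def select_final_layer_norm(layer_keys, final_layer_norm_idx=FINAL_LAYER_NORM_INDEX):
    """Return the layer key treated as the final layer norm.

    layer_keys: ordered list of layer identifiers (hypothetical names).
    final_layer_norm_idx: position of the final layer norm in that list;
        the default of -1 selects the last layer.
    """
    return layer_keys[final_layer_norm_idx]


# Hypothetical layout in which the final layer norm is NOT the last layer.
layers = ["layer_01", "layer_02", "layer_03", "layer_04", "layer_05"]

default_choice = select_final_layer_norm(layers)      # uses idx=-1
custom_choice = select_final_layer_norm(layers, -2)   # user-supplied index
```

With the default, the last entry is picked; passing a custom index lets a model whose final layer norm sits elsewhere in the layer list be handled without changing the constant.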

nelyahu, May 08 '24 09:05