Megatron-LM
MuonClip support (non-split version)
Added clipping support for MLA and MHA (GQA) attention.
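For reviewers unfamiliar with the technique, here is a minimal sketch of the general QK-clip idea: when a head's largest attention logit exceeds a threshold, the query and key projection weights of that head are scaled down. This is not the code in this PR. The function names, the `tau` parameter, and the plain-list weight layout are hypothetical, and the sketch assumes ordinary MHA with one key head per query head; how this PR handles shared key heads in GQA or the MLA projections is not shown here.

```python
import math


def qk_clip_factors(max_logits, tau):
    """Per-head factor gamma_h = min(1, tau / S_max_h).

    max_logits: largest pre-softmax attention logit observed per head
                (hypothetical input; assumed to be tracked during the forward pass).
    tau:        clipping threshold (hypothetical name).
    """
    return [min(1.0, tau / s) if s > 0 else 1.0 for s in max_logits]


def clip_head_weights(w_q, w_k, gamma):
    """Scale one head's query and key projections by sqrt(gamma) each,
    so that every q.k logit of that head shrinks by a factor of gamma."""
    root = math.sqrt(gamma)
    scaled_q = [[root * x for x in row] for row in w_q]
    scaled_k = [[root * x for x in row] for row in w_k]
    return scaled_q, scaled_k


# Head 0 is under the threshold and is left alone; head 1 is scaled down.
factors = qk_clip_factors([50.0, 200.0], tau=100.0)
```

Splitting the factor evenly as `sqrt(gamma)` on each side is one common choice; it keeps the query and key weights at a comparable scale.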
This PR requires TE's https://github.com/NVIDIA/TransformerEngine/pull/2195 (TE 2.9.0).
/ok to test 7917e68
/ok to test 495f58d
/ok to test 55cc00d
/ok to test 9095615
Boxiang, I suppose you need someone in the expert reviewer list to review.
/ok to test 1bb0407
/ok to test b63c573
/ok to test 95fdba3
Can we rename this PR? It should just be "QK logits clipping" or something similar.
/ok to test 6562a52