
Avoid calling set_save_original_input with FP8 delayed scaling

Open dalgarak opened this issue 3 months ago • 1 comments

This pull request restores a missing FP8 delayed-scaling check that guards the call to set_save_original_input().

When FP8 delayed scaling is enabled (--fp8-recipe 'delayed'), the set_save_original_input() function must not be called, but the necessary condition was accidentally omitted in commit 08814e8 (ADLR/megatron-lm!4030 - perf(MoE): Support recomputation for FP8 layernorm/moe_act/shared_experts).

This PR adds the missing condition to restore the correct behavior, and fixes the "AssertionError: DelayedScaling recipe is not supported with save_original_input" error in the released core_v0.14.0 version.
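
For illustration only, the intent of the guard can be sketched roughly as below. This is not the actual diff: `Fp8Config`, `DummyLinear`, `may_save_original_input`, and `maybe_set_save_original_input` are hypothetical stand-ins, and the assumption that the call is additionally gated on FP8 being enabled is mine rather than something stated above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fp8Config:
    """Hypothetical stand-in for the relevant config fields (not the real config class)."""
    fp8: Optional[str] = None      # None means FP8 is disabled (assumed convention)
    fp8_recipe: str = "delayed"    # mirrors the --fp8-recipe argument


class DummyLinear:
    """Hypothetical module used only to record whether the flag was set."""
    save_original_input = False


def may_save_original_input(config: Fp8Config) -> bool:
    """Allow saving the original input only when FP8 is on and the recipe is not 'delayed'."""
    return config.fp8 is not None and config.fp8_recipe != "delayed"


def maybe_set_save_original_input(module: DummyLinear, config: Fp8Config) -> bool:
    """Set the flag on the module if the recipe permits it; report whether it was set."""
    if may_save_original_input(config):
        # Stand-in for the real set_save_original_input(module) call.
        module.save_original_input = True
        return True
    return False
```

Under this sketch, a config with `fp8_recipe="delayed"` skips the call entirely, which is the path that avoids the assertion quoted above.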

dalgarak avatar Oct 14 '25 07:10 dalgarak

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

copy-pr-bot[bot] avatar Oct 14 '25 07:10 copy-pr-bot[bot]

/ok to test ea52007

yaox12 avatar Dec 01 '25 01:12 yaox12

/ok to test 775f386

yaox12 avatar Dec 02 '25 05:12 yaox12

/ok to test 3465f3f

yaox12 avatar Dec 05 '25 01:12 yaox12

/ok to test a5f5cd5

yaox12 avatar Dec 05 '25 01:12 yaox12

Thank you for your contribution!

NVIDIA Megatron-LM is currently transitioning to development on GitHub. We will aim to review your PR after we complete our transition and stabilize our GitHub development process.

Thank you for your understanding.

github-actions[bot] avatar Dec 05 '25 01:12 github-actions[bot]