torchtitan
Fix the incorrect step log for profiler after resuming from a checkpoint
Summary: The profiler maintains its own local step counter, which is not synchronized with the train step restored from a checkpoint. As a result, the steps it logs after resuming are incorrect. This PR fixes the issue.
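A minimal sketch of the mismatch, under stated assumptions: `StepTracer` and `run` are hypothetical stand-ins written for illustration and are not torchtitan's or PyTorch's API. It only shows how a locally kept counter restarts at zero after a resume, and how seeding it with the checkpointed train step brings the logged step back in line.

```python
class StepTracer:
    """Hypothetical stand-in for a profiler that labels output by step."""

    def __init__(self, start_step: int = 0) -> None:
        # Local counter; it knows nothing about the checkpoint unless seeded.
        self.step_num = start_step

    def step(self) -> str:
        # Advance the local counter and return the label that would be logged.
        self.step_num += 1
        return f"iteration_{self.step_num}"


def run(resume_step: int, num_steps: int, sync: bool) -> list[str]:
    """Simulate training that resumes at `resume_step` (assumed loaded from a checkpoint)."""
    # With sync=False the tracer starts from 0, which reproduces the mismatch.
    tracer = StepTracer(start_step=resume_step if sync else 0)
    return [tracer.step() for _ in range(num_steps)]


unsynced = run(resume_step=100, num_steps=2, sync=False)
synced = run(resume_step=100, num_steps=2, sync=True)
print(unsynced)  # labels restart from 1 even though training resumed at step 100
print(synced)    # labels continue from the checkpointed step
```

Passing the restored step in at construction time keeps the counter and the trainer aligned without the tracer having to read checkpoint state itself.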