[Request] Decouple profiler `profile_freq` from memory snapshot frequency

Open awgu opened this issue 1 year ago • 0 comments

For memory snapshots, we usually only need to take one snapshot on one of the first few iterations (e.g. step 2 or 3), after the optimizer step has run on step 1. We do not need to take snapshots repeatedly. This is not easy to do today because the memory profiler uses the same `profile_freq` as the trace profiler.
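
A minimal sketch of what the decoupling could look like, written as plain Python so it runs without a GPU. Only `profile_freq` comes from this issue; `ProfilingConfig`, `memory_snapshot_step`, `should_trace`, and `should_snapshot` are hypothetical names used for illustration and are not existing torchtitan options.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProfilingConfig:
    # Existing knob named in this issue: the trace profiler fires every N steps.
    profile_freq: int = 10
    # Hypothetical new knob (an assumption, not a real torchtitan flag):
    # the single step at which to take a memory snapshot; None disables it.
    memory_snapshot_step: Optional[int] = 2


def should_trace(cfg: ProfilingConfig, step: int) -> bool:
    """Periodic trace profiling, driven only by profile_freq."""
    return cfg.profile_freq > 0 and step % cfg.profile_freq == 0


def should_snapshot(cfg: ProfilingConfig, step: int) -> bool:
    """One-shot memory snapshot, independent of profile_freq."""
    return cfg.memory_snapshot_step is not None and step == cfg.memory_snapshot_step


# Usage: snapshot once at step 3 while tracing every 10 steps.
cfg = ProfilingConfig(profile_freq=10, memory_snapshot_step=3)
snapshot_steps = [s for s in range(1, 31) if should_snapshot(cfg, s)]
trace_steps = [s for s in range(1, 31) if should_trace(cfg, s)]
```

With a separate setting like this, the training loop would call the snapshot path once at the chosen early step and leave the trace schedule untouched. The actual snapshot call is left out here on purpose.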

awgu, Jul 23 '24 01:07