torchtitan
profile with modules and stack
I find these two arguments very helpful, maybe others do too.
with_stack has been causing timeouts because it significantly slows down profiling for large models. It's better to make it optional.
So do you want me to add a job config argument for with_stack only, or for both?
I think it's ok to add a job config. I'll let @fegin comment whether we need two, or one for both, or one for with_stack only.
with_stack only.
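For reference, a minimal sketch of what that could look like, assuming with_modules stays always on and only with_stack is gated behind an option. The flag name `--profile_with_stack` and the helper `build_profiler_kwargs` are hypothetical and not taken from torchtitan's job config; `with_stack` and `with_modules` are existing keyword arguments of `torch.profiler.profile`.

```python
import argparse


def build_profiler_kwargs(argv=None):
    """Build keyword arguments for torch.profiler.profile from CLI flags.

    Hypothetical helper: the real job config option may use a different
    name and a different parsing mechanism.
    """
    parser = argparse.ArgumentParser()
    # Off by default, because recording stacks slows profiling down
    # noticeably for large models.
    parser.add_argument(
        "--profile_with_stack",
        action="store_true",
        help="record source stack traces in the profiler trace",
    )
    args = parser.parse_args(argv)
    return {
        "with_modules": True,  # always enabled in this sketch
        "with_stack": args.profile_with_stack,  # opt-in
    }


# Intended usage (requires PyTorch, so it is left as a comment here):
#   import torch
#   with torch.profiler.profile(**build_profiler_kwargs()) as prof:
#       ...  # training step
```

With this shape, existing runs keep the cheaper default and anyone who needs stack traces opts in explicitly.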