How to use nsys?

Open vedantroy opened this issue 1 year ago • 1 comments

Is there a recommended way to use nsys / Nsight? I know there's a profiling hook for the PyTorch profiler, but I'm wondering how to use nsys instead.

Can I use these APIs:

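import torch
import torch.cuda.profiler as profiler

x = torch.randn(2, 3, 4, 5, device="cuda")  # example 4D input; channels_last requires 4 dims

with torch.autograd.profiler.emit_nvtx():
    profiler.start()
    y = x.view(1, -1)
    z = x.to(memory_format=torch.channels_last)
    zz = z.reshape(1, -1)
    profiler.stop()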

Furthermore, I'm not sure which of the below I'm supposed to use:

    import torch.cuda.profiler as profiler
    with torch.autograd.profiler.emit_nvtx():

vedantroy, Jun 13 '24 18:06

Hey @vedantroy, IIUC emit_nvtx just adds additional information (NVTX ranges) to the trace. To actually profile your program with nsys, you have to launch your program under it (e.g., nsys profile --gpu-metrics-device=0 -o [output] [command]).

yifuwang, Jun 28 '24 17:06
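
To connect the two halves of this thread: the calls in the question only annotate or mark a region, while the nsys launcher does the actual recording. Below is a minimal sketch of assembling such a launch command in Python. The helper name `build_nsys_command` and the script name `train.py` are hypothetical and not part of torchtitan; the nsys options used (`--trace`, `-o`, `--capture-range=cudaProfilerApi`, `--capture-range-end=stop`) are standard nsys flags, but check them against your installed nsys version.

```python
import shlex


def build_nsys_command(train_cmd, output="trace", use_capture_range=False):
    """Assemble an `nsys profile` command line that wraps a training command.

    train_cmd: the command to profile, as a list of arguments (hypothetical here).
    output: base name of the report file that nsys writes.
    use_capture_range: if True, nsys only records between cudaProfilerStart and
        cudaProfilerStop, which is what torch.cuda.profiler.start()/stop() call.
    """
    cmd = ["nsys", "profile", "--trace=cuda,nvtx", "-o", output]
    if use_capture_range:
        # Record only the region bracketed by profiler.start() / profiler.stop().
        cmd += ["--capture-range=cudaProfilerApi", "--capture-range-end=stop"]
    cmd += list(train_cmd)
    return cmd


if __name__ == "__main__":
    # `train.py` is a placeholder for whatever launches your training job.
    cmd = build_nsys_command(["python", "train.py"], output="my_run",
                             use_capture_range=True)
    print(shlex.join(cmd))
```

Without the capture-range flags, nsys records the whole run and the `profiler.start()` / `profiler.stop()` calls do not restrict the capture. In either case, `emit_nvtx()` is what makes autograd ops show up as named NVTX ranges in the timeline.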