GPU memory and latency time

Open StephanPan opened this issue 1 year ago • 3 comments

During model warm-up, especially in the first few steps, GPU memory usage fluctuates significantly and the process takes a long time, around twenty seconds. Neither GPU memory usage nor latency is stable. Are there any suggestions? Thanks.

StephanPan avatar Oct 26 '23 02:10 StephanPan

Normally, GPU memory usage is around 2400 MB, but occasionally it exceeds 10 GB.
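
To put numbers on the fluctuation described above, per-step latency and peak GPU memory can be logged during warm-up. The following is only a minimal sketch: `measure_steps` is a hypothetical helper, not part of facer or PyTorch, and the stand-in workload in the last lines is an assumption used so the snippet runs without a GPU.

```python
import time
from typing import Callable, List, Tuple


def measure_steps(step: Callable[[], object], n_steps: int,
                  use_cuda: bool = False) -> List[Tuple[float, float]]:
    """Run `step` n_steps times; return (latency in s, peak memory in MiB) per call."""
    if use_cuda:
        import torch  # imported lazily so the CPU-only path needs no PyTorch
    stats = []
    for _ in range(n_steps):
        if use_cuda:
            torch.cuda.synchronize()
            torch.cuda.reset_peak_memory_stats()  # start a fresh peak for this step
        start = time.perf_counter()
        step()
        if use_cuda:
            torch.cuda.synchronize()  # wait for queued kernels before stopping the clock
        latency = time.perf_counter() - start
        peak_mib = torch.cuda.max_memory_allocated() / 2**20 if use_cuda else 0.0
        stats.append((latency, peak_mib))
    return stats


# Stand-in workload; for a real model, pass `lambda: model(batch)` and use_cuda=True.
for i, (latency, peak) in enumerate(measure_steps(lambda: sum(range(10_000)), 3)):
    print(f"step {i}: {latency * 1e3:.2f} ms, peak {peak:.0f} MiB")
```

Note that `torch.cuda.max_memory_allocated()` counts only tensors handed out by PyTorch's allocator, so the values will be lower than what `nvidia-smi` shows, which also includes cached blocks and the CUDA context.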

StephanPan avatar Oct 26 '23 02:10 StephanPan

Does this UserWarning appear? If so, you can turn it off with the two calls below.

UserWarning: operator() profile_node %385 : int[] = prim::profile_ivalue(%383) does not have profile information (Triggered internally at ../third_party/nvfuser/csrc/graph_fuser.cpp:104.)
torch._C._jit_set_profiling_executor(False)
torch._C._jit_set_profiling_mode(False)
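
A rough sketch of where these two calls might go in a script: before the TorchScript model runs for the first time, so that the first few forward passes are not spent profiling the graph. `TinyNet` is a hypothetical stand-in for the real model, and both switches are private PyTorch APIs whose behaviour may differ between releases.

```python
import torch


class TinyNet(torch.nn.Module):
    """Hypothetical stand-in for the real scripted model."""

    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(8, 4)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.linear(x))


# Private, undocumented switches (note the leading underscore); they may
# change or disappear between PyTorch releases. Set them before the first
# forward pass of the scripted model.
torch._C._jit_set_profiling_executor(False)
torch._C._jit_set_profiling_mode(False)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.jit.script(TinyNet()).eval().to(device)
example = torch.randn(2, 8, device=device)

with torch.no_grad():
    for _ in range(3):  # a few warm-up calls
        output = model(example)
```

The switches are process-wide, so they affect every TorchScript module in the same Python process, not only the one loaded here.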

Ttayu avatar Oct 26 '23 05:10 Ttayu

Thanks, it works.

StephanPan avatar Nov 23 '23 02:11 StephanPan