tenset
Nsight Compute is not able to profile the kernels
I want to profile the kernels using ncu --target-processes all python3 measure_programs.py --target cuda, but no kernels are profiled. Is this normal? How can I profile the kernels with an NVIDIA profiler (ncu, nsys, or nvprof)?