[FEA] Use NVML to manage clocks, etc

Open alliepiper opened this issue 5 years ago • 6 comments

Adding NVML as an optional dependency would allow some cool features:

  • Lock clock frequency.
    • Per-device default frequency.
    • Per-device maximum frequency.
    • Explicit frequency.
  • Log various device stats per measurement.
    • SM/Mem clock frequencies.
    • Device utilization.
    • Power state/usage.
  • Check throttle state after each measurement.
    • Log a warning with the throttle reason and details (e.g. for thermal throttle, show current temp and thresholds).
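
As a rough illustration of the throttle check, the sketch below decodes a throttle-reason bitmask into readable strings that could go into a warning. It is not nvbench code: `describe_throttle_reasons` and the `sketch` namespace are hypothetical names, and the bit values are copied by hand from the `nvmlClocksThrottleReason*` macros in `nvml.h` as I understand them, so they should be checked against the installed header. In real use the mask would come from `nvmlDeviceGetCurrentClocksThrottleReasons`; the sketch avoids linking NVML so it can be compiled anywhere.

```cpp
#include <cstdint>
#include <string>
#include <vector>

namespace sketch // hypothetical namespace, not part of nvbench
{

struct reason
{
  std::uint64_t bit;
  const char *name;
};

// Assumption: these values mirror the nvmlClocksThrottleReason* macros in
// nvml.h. Verify them against the header shipped with your driver.
constexpr reason known_reasons[] = {
  {0x001ULL, "GPU idle"},
  {0x002ULL, "applications clocks setting"},
  {0x004ULL, "software power cap"},
  {0x008ULL, "hardware slowdown"},
  {0x010ULL, "sync boost"},
  {0x020ULL, "software thermal slowdown"},
  {0x040ULL, "hardware thermal slowdown"},
  {0x080ULL, "hardware power brake slowdown"},
  {0x100ULL, "display clock setting"},
};

// Returns one description per bit set in `mask`, in ascending bit order.
// Bits that are not in the table above are reported as a single extra entry.
inline std::vector<std::string> describe_throttle_reasons(std::uint64_t mask)
{
  std::vector<std::string> out;
  std::uint64_t leftover = mask;
  for (const auto &r : known_reasons)
  {
    if (mask & r.bit)
    {
      out.emplace_back(r.name);
      leftover &= ~r.bit;
    }
  }
  if (leftover != 0)
  {
    out.emplace_back("unrecognized reason bits");
  }
  return out;
}

} // namespace sketch
```

After each measurement, a caller could pass the mask reported by NVML to `sketch::describe_throttle_reasons` and log one warning line per returned string. For thermal reasons it could also append the current temperature and thresholds, as suggested above.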

alliepiper avatar Mar 18 '21 16:03 alliepiper

@allisonvacanti Does it make sense to also print an error/warning if the clocks aren't fixed? I know Google Benchmark has similar printouts for CPU governor settings.

cliffburdick avatar Jun 24 '21 17:06 cliffburdick

@cliffburdick Yes, that is something I plan to add if the NVML APIs provide this information.

alliepiper avatar Jun 24 '21 17:06 alliepiper

#46 added the basics:

  • Got NVML dependency sorted out
  • Locking clocks to base/max implemented for Volta (SM 7.0) and above
  • Toggling persistence mode on Linux

Still a lot more we can do here, though. Unassigning myself for now since I'll be busy with other work for a while.

alliepiper avatar Jan 13 '22 22:01 alliepiper