kineto
[Discussion] Which clock should we be using for timestamps?
Following up on https://github.com/pytorch/kineto/pull/868, let's continue the discussion here about which clock to use for timestamp collection.
@mwootton pointed out that we should always use a monotonic clock to avoid ordering issues between the start and end timestamps of short kernels.
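To make the ordering concern concrete, here is a minimal sketch (not Kineto code) that takes two back-to-back readings from a clock, as would happen around a very short kernel. The names `Span`, `stampShortRegion`, and `isOrdered` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper type, not part of Kineto: a start/end pair in nanoseconds.
struct Span {
  int64_t startNs;
  int64_t endNs;
};

// Take two back-to-back readings of the given clock, mimicking the start and
// end timestamps of a very short kernel.
template <typename Clock>
Span stampShortRegion() {
  using std::chrono::duration_cast;
  using std::chrono::nanoseconds;
  auto start = Clock::now();
  auto end = Clock::now();
  return Span{
      duration_cast<nanoseconds>(start.time_since_epoch()).count(),
      duration_cast<nanoseconds>(end.time_since_epoch()).count()};
}

// The standard guarantees that steady_clock never goes backwards.
static_assert(std::chrono::steady_clock::is_steady,
              "steady_clock must be monotonic");

// True when the end timestamp does not precede the start timestamp.
bool isOrdered(const Span& s) {
  return s.endNs >= s.startNs;
}
```

With `stampShortRegion<std::chrono::steady_clock>()`, `isOrdered` always holds. With `std::chrono::system_clock` it usually holds too, but the standard does not guarantee it, because the system clock can be adjusted (for example by NTP) between the two readings.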
@mantaionut updated the Windows CUPTI clock to use `std::chrono::system_clock` via `cuptiActivityRegisterTimestampCallback`; previously it was converting CUPTI's `steady_clock` to `system_clock` in post processing. This caused @mantaionut to observe flaky CI tests.
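For readers unfamiliar with the callback mechanism, a rough sketch of such a timestamp callback follows. It is not the code from the PR. The function name `systemClockNs` is hypothetical, and the signature (no arguments, returns `uint64_t`) is assumed to match CUPTI's `CUpti_TimestampCallbackFunc` type. The registration call is left as a comment so the sketch compiles without the CUDA toolkit.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical callback: returns the current system_clock time in nanoseconds
// since the clock's epoch (in practice the Unix epoch).
uint64_t systemClockNs() {
  using namespace std::chrono;
  return static_cast<uint64_t>(
      duration_cast<nanoseconds>(system_clock::now().time_since_epoch())
          .count());
}

// With the CUPTI headers available, registration would look roughly like:
//   cuptiActivityRegisterTimestampCallback(systemClockNs);
// Once registered, CUPTI calls the function whenever it needs a timestamp,
// so no post-processing conversion between clocks is required.
```

Swapping `system_clock` for `steady_clock` in the callback body would give CUPTI a monotonic time source through the same mechanism.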
According to the CUPTI documentation, CUPTI by default uses a monotonic clock on Windows and CLOCK_REALTIME (non-monotonic) on Linux:
> This function registers a callback function to obtain timestamp of user's choice instead of using CUPTI provided timestamp. By default CUPTI uses different methods, based on the underlying platform, to retrieve the timestamp
> - Linux and Android use clock_gettime(CLOCK_REALTIME, ..)
> - Windows uses QueryPerformanceCounter()
> - Mac uses mach_absolute_time()
> - QNX uses ClockCycles()
>
> Timestamps retrieved using these methods are converted to nanosecond if needed before usage.
Let's continue the discussion of choosing which clock to use in Kineto:
- Monotonic clock (steady_clock)
- Real time clock (system_clock)
- TSC clock (calibrated to either the monotonic or the non-monotonic clock).
The TSC clock is much faster than the steady or system clocks; it was introduced as part of the PyTorch Profiler's overhead overhaul (https://github.com/pytorch/pytorch/pull/73855). The implementation in PyTorch Profiler is monotonic, fast, and would help align Kineto timestamps to the same clock as PT Profiler.
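As a rough illustration of what calibrating a TSC to the monotonic clock could look like, here is a minimal sketch. It is not the PyTorch Profiler implementation. The names `readTicks`, `steadyNs`, `TickConverter`, and `calibrate` are hypothetical, the calibration is a simple two-point linear fit, and the sketch assumes an invariant TSC on x86 with GCC or Clang (it falls back to `steady_clock` ticks elsewhere so it still compiles).

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <x86intrin.h>
#define SKETCH_HAS_TSC 1
#endif

// Raw tick source: the TSC where available, otherwise steady_clock ticks.
inline uint64_t readTicks() {
#ifdef SKETCH_HAS_TSC
  return __rdtsc();
#else
  return static_cast<uint64_t>(
      std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}

// Reference monotonic time in nanoseconds.
inline int64_t steadyNs() {
  using namespace std::chrono;
  return duration_cast<nanoseconds>(steady_clock::now().time_since_epoch())
      .count();
}

// Hypothetical converter: maps raw ticks onto the steady_clock timeline.
// Assumes ticks were read at or after the calibration base point.
struct TickConverter {
  uint64_t baseTicks;
  int64_t baseNs;
  double nsPerTick;

  int64_t toNs(uint64_t ticks) const {
    return baseNs + static_cast<int64_t>(
                        static_cast<double>(ticks - baseTicks) * nsPerTick);
  }
};

// Sample (ticks, ns) at both ends of a short window and fit a line through them.
inline TickConverter calibrate(
    std::chrono::milliseconds window = std::chrono::milliseconds(20)) {
  uint64_t t0 = readTicks();
  int64_t n0 = steadyNs();
  std::this_thread::sleep_for(window);
  uint64_t t1 = readTicks();
  int64_t n1 = steadyNs();
  double nsPerTick =
      static_cast<double>(n1 - n0) / static_cast<double>(t1 - t0);
  return TickConverter{t0, n0, nsPerTick};
}
```

The idea is to pay the calibration cost once, record only raw ticks on the hot path, and convert to nanoseconds later with `toNs`. Calibrating against `system_clock` instead would put the timestamps on the real-time timeline, which corresponds to the "non-monotonic" option in the list above.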
cc @chaekit, @briancoutinho, @davidberard98
A monotonic clock does make sense. Also, @aaronenyeshi, your comment here further supports this:
> The implementation in PyTorch Profiler is monotonic, fast, and would help align Kineto timestamps to the same clock as PT Profiler.
Switched to a TSC-based clock in https://github.com/pytorch/pytorch/pull/125036