IPUToolkit.jl

Use also upper counter for more accurate benchmarking

Open giordano opened this issue 11 months ago • 0 comments

We can use __builtin_ipu_get_scount_u, together with __builtin_ipu_get_scount_l, to get the full 64-bit cycle counter. The challenge is that when calling both builtins in a row (ideally the upper first, then the lower), the lower counter may overflow and flip the upper one in between the two calls, so we need to deal manually with the case where the lower counter is less than 12 (or 6?).

giordano, Jul 21 '23 09:07