
Output the probability of being faster (slower) when comparing results

Open serhiy-storchaka opened this issue 7 years ago • 5 comments

When comparing two results, it would be helpful to output the probability that one result is faster than the other.

If times1 and times2 are sets of measured times, then the probability of the first benchmark being faster than the second one is estimated as:

sum(x < y for x in times1 for y in times2)/len(times1)/len(times2)

In practice, you can sort one of the sets and use binary search as an optimization.
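A minimal Python sketch of the formula above and of the sorted/binary-search variant. The function names `prob_faster_naive` and `prob_faster_bisect` are hypothetical and are not part of pyperf; ties (x == y) are not counted as "faster", matching the strict `<` in the formula.

```python
from bisect import bisect_right


def prob_faster_naive(times1, times2):
    """Fraction of pairs (x, y) with x < y, exactly as in the formula: O(n*m)."""
    return sum(x < y for x in times1 for y in times2) / len(times1) / len(times2)


def prob_faster_bisect(times1, times2):
    """Same estimate, sorting times2 once and using binary search per x."""
    sorted2 = sorted(times2)
    m = len(sorted2)
    # bisect_right gives the number of y values <= x, so m minus that
    # is the number of y values strictly greater than x.
    wins = sum(m - bisect_right(sorted2, x) for x in times1)
    return wins / len(times1) / m
```

Both functions should return the same value for the same inputs; the second only changes how the pairs are counted.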

serhiy-storchaka avatar Mar 09 '17 07:03 serhiy-storchaka

I like the idea. Your formula only works if the two lists have the same number of samples, right?

vstinner avatar Mar 09 '17 09:03 vstinner

No, it works for different numbers of samples. The formula as written has computational complexity O(n*m), but it can be optimized to O(n*log(m)) with binary search, or even to linear O(n+m). First try the simplest formula, and optimize it only if it is too slow.
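One way the linear O(n+m) count could look, sketched under the assumption that both lists are already sorted in ascending order (sorting them first would add its own cost). The name `prob_faster_merge` is hypothetical.

```python
def prob_faster_merge(sorted1, sorted2):
    """Count pairs x < y in a single pass over two ascending lists."""
    n, m = len(sorted1), len(sorted2)
    wins = 0
    j = 0  # index of the first element of sorted2 strictly greater than x
    for x in sorted1:
        # j never moves backwards because sorted1 is ascending.
        while j < m and sorted2[j] <= x:
            j += 1
        wins += m - j
    return wins / n / m
```

Because `j` only advances, the inner loop runs at most m times in total, which gives the n+m bound for the counting step.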

serhiy-storchaka avatar Mar 09 '17 12:03 serhiy-storchaka

The estimated error of the probability estimate p is about:

sqrt(p*(1-p)/len(times1)/len(times2))
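A small sketch that evaluates this error estimate as written. The function name `prob_error` and its parameters are hypothetical. The expression has the shape of a binomial standard error with len(times1)*len(times2) trials, so it is best read as the rough approximation the comment says it is.

```python
from math import sqrt


def prob_error(p, n1, n2):
    """Approximate error of the estimate p for sample sizes n1 and n2."""
    return sqrt(p * (1 - p) / n1 / n2)
```

For example, with p = 0.5 and 10 samples on each side, the estimate comes out at about 0.05.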

serhiy-storchaka avatar Mar 09 '17 12:03 serhiy-storchaka

@serhiy-storchaka would you be interested in writing a pull request to implement this idea?

vstinner avatar Jan 11 '19 15:01 vstinner

I opened https://github.com/psf/pyperf/pull/118. Is there still interest?

sweeneyde avatar Oct 21 '21 21:10 sweeneyde