Provide three-state output: "changed", "not changed", "unsure"

Open cfallin opened this issue 3 years ago • 0 comments

Right now, Sightglass uses a single threshold based on a confidence interval computed by Behrens-Fisher to determine whether a sampled statistic shifted between configurations.

The result is that we get either "changed" (e.g., the benchmark got 5% faster) or "not changed". However, the latter answer can also appear if we simply don't have enough data points to prove statistical significance, or if the system is too noisy.

This "false negative" is somewhat dangerous: we could make a change, see that it is performance-neutral according to Sightglass, and accept it, but actually we just didn't turn the knobs up high enough.

Ideally, Sightglass should provide a third output of "unsure" if the measurements aren't precise enough to prove either "changed" or "not changed" to the desired confidence.
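
To make the idea concrete, here is a minimal sketch of what a three-state decision could look like. It is not Sightglass's actual code: the `Verdict` enum, the `classify` function, and the `tolerance` parameter are all hypothetical names, and it assumes we already have a confidence interval for the relative difference between the two configurations (e.g., `0.05` meaning 5%). The rule used here is one possible choice: "changed" only if the whole interval lies beyond the tolerance, "not changed" only if the whole interval lies within it, and "unsure" otherwise.

```rust
/// Hypothetical three-state verdict for a benchmark comparison.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Changed,
    NotChanged,
    Unsure,
}

/// Classify a confidence interval `[ci_low, ci_high]` for the relative
/// difference between two configurations.
///
/// `tolerance` is an assumed parameter: the largest relative shift that we
/// are still willing to call "no change" (e.g., 0.05 for 5%).
pub fn classify(ci_low: f64, ci_high: f64, tolerance: f64) -> Verdict {
    assert!(ci_low <= ci_high, "interval bounds must be ordered");
    assert!(tolerance >= 0.0, "tolerance must be non-negative");

    if ci_low > tolerance || ci_high < -tolerance {
        // The entire interval lies beyond the tolerance on one side.
        Verdict::Changed
    } else if ci_low >= -tolerance && ci_high <= tolerance {
        // The entire interval fits inside the tolerance band.
        Verdict::NotChanged
    } else {
        // The interval straddles a tolerance boundary: the measurements
        // are not precise enough to support either claim.
        Verdict::Unsure
    }
}
```

With a tolerance of `0.05`, an interval of `[-0.02, 0.12]` would come back as `Unsure` under this rule, whereas today it would just be reported as "not changed". Collecting more samples or reducing noise should narrow the interval until it lands in one of the other two states.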

cfallin, Jul 28 '22 17:07