
High error rates across benchmark runs


I have a nice set of benchmarks (which you can see here) and they run fine, reporting nicely low error rates in the 0.2% to 2% range.

But when I do the same benchmark run over and over (on the same machine, of course) and then compare the error range across all of these runs, the picture is not so nice: the discrepancy across runs ranges up to 20%. The chart below shows 10 invocations of my benchmark runner; each run is nice and stable with a small standard deviation (shown in light blue), but the steps between runs are unexpected:

[chart: per-run means for 10 consecutive benchmark runs, each with a small standard-deviation band (light blue) but with step changes between runs]

What is worse, when I run those 10 iterations again, I get a completely different set of steps, showing that this is random noise of some kind:

[chart: a second set of 10 runs showing a different pattern of steps]

So while a single benchmark run has a nice small standard deviation, the test is not repeatable across runs, which makes it really hard to use for performance regression testing. Any ideas how I can improve this?
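One knob I can pin down explicitly is the job configuration, so that every launch of the runner executes an identical measurement plan; a minimal sketch using the standard SimpleJob attribute (the counts here are illustrative, not settings I have verified to fix this):

    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Engines;

    // Illustrative: pin warmup and measurement counts so that every
    // process launch runs exactly the same measurement plan.
    [SimpleJob(RunStrategy.Throughput, warmupCount: 10, iterationCount: 20)]
    public class MyBenchmarks
    {
        [Benchmark]
        public void Work() { /* benchmark body */ }
    }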

lovettchris avatar Jun 19 '20 21:06 lovettchris

Hi @lovettchris

Does your benchmark involve I/O or anything else that introduces variance by itself?

Or is it 100% CPU-bound code?
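For example, code that looks CPU-bound can still introduce its own variance by allocating on every invocation (GC activity) or by seeding its work nondeterministically; a hypothetical illustration of the patterns to check for (not taken from your benchmarks):

    using System;
    using BenchmarkDotNet.Attributes;

    public class HiddenVarianceExamples
    {
        // Allocates a large array per invocation: GC pauses vary run to run.
        [Benchmark]
        public byte[] HiddenAllocation() => new byte[1024 * 1024];

        // Time-seeded Random: each invocation does slightly different work.
        [Benchmark]
        public double HiddenNondeterminism()
        {
            var rng = new Random();
            double sum = 0;
            for (int i = 0; i < 1000; i++)
                sum += Math.Sqrt(rng.NextDouble());
            return sum;
        }
    }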

adamsitnik avatar Jun 26 '20 10:06 adamsitnik

It is CPU-bound, no I/O of any kind. I did a simple test with this benchmark and the results were much better:

        [Benchmark]
        public void SimpleMath()
        {
            // Note: Sum starts at 0 and Math.Sqrt(0) == 0, so Sum stays 0;
            // the loop effectively measures one million Math.Sqrt calls.
            this.Sum = 0;
            for (int i = 0; i < 1000000; i++)
            {
                this.Sum += Math.Sqrt(this.Sum);
            }
        }
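For completeness, the runner side is just the standard console entry point; a minimal sketch, assuming the benchmark above lives in a class named MathBenchmarks (name illustrative):

    using BenchmarkDotNet.Running;

    public static class Program
    {
        public static void Main()
        {
            // BenchmarkDotNet spawns a separate process for the run
            // and aggregates the per-iteration statistics.
            BenchmarkRunner.Run<MathBenchmarks>();
        }
    }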

With this pure math test I ran my benchmark runner 27 times, then did that again to get another 27, and the results are consistent, with an error range of about 0.4%, which is good enough for my purposes. This rules out Windows background activity, .NET framework overhead, and so on, so there must be something weird about my code that is creating much larger variations from test run to test run...
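The ~0.4% figure is just the relative spread of the per-run means; a rough sketch of the arithmetic, assuming one mean per run (LoadRunMeans is a hypothetical helper, e.g. parsing BenchmarkDotNet's exported results):

    using System;
    using System.Linq;

    class SpreadCheck
    {
        static void Main()
        {
            // Hypothetical input: the mean time reported by each run.
            double[] runMeans = LoadRunMeans();

            double mean = runMeans.Average();
            double stdDev = Math.Sqrt(
                runMeans.Sum(m => (m - mean) * (m - mean)) / (runMeans.Length - 1));
            Console.WriteLine($"cross-run spread: {stdDev / mean:P2}");
        }

        // Hypothetical: in practice, parse the per-run means out of the
        // exported CSV/JSON result files.
        static double[] LoadRunMeans() => new[] { 100.2, 100.5, 99.9 }; // placeholder values
    }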

[chart: two sets of 27 runs of the pure-math benchmark, with means consistent across all runs to within ~0.4%]

lovettchris avatar Jun 27 '20 00:06 lovettchris

@lovettchris since you are not the first person who asked this question, I started writing a detailed blog post about it. I should finish it this week. Ping me if I don't!

adamsitnik avatar Jun 29 '20 15:06 adamsitnik

@adamsitnik Hi! Do you know if I can find an explanation somewhere?

kcrg avatar Apr 11 '22 10:04 kcrg