BenchmarkDotNet
BenchmarkDotNet copied to clipboard
Flaky tests because of AssertAllocations with InProcessEmitToolchain on Windows
MemoryDiagnoserTests.AssertAllocations frequently fails GitHub/Windows builds. Test examples: EngineShouldNotIntroduceBoxing, AwaitingTasksShouldNotInterfereAllocationResults.
Some recently failed builds:
https://github.com/dotnet/BenchmarkDotNet/runs/5218066245
2022-02-16T15:25:05.7264604Z [xUnit.net 00:19:53.52] EngineShouldNotIntroduceBoxing(toolchain: InProcessEmitToolchain) [FAIL]
2022-02-16T15:25:06.4522717Z Failed EngineShouldNotIntroduceBoxing(toolchain: InProcessEmitToolchain) [3 s]
2022-02-16T15:25:06.4523163Z Error Message:
2022-02-16T15:25:06.4523488Z Assert.Equal() Failure
2022-02-16T15:25:06.4523732Z Expected: 0
2022-02-16T15:25:06.4524180Z Actual: 100984
2022-02-16T15:25:06.4524599Z Stack Trace:
2022-02-16T15:25:06.4525647Z at BenchmarkDotNet.IntegrationTests.MemoryDiagnoserTests.AssertAllocations(IToolchain toolchain, Type benchmarkType, Dictionary`2 benchmarksAllocationsValidators) in D:\a\BenchmarkDotNet\BenchmarkDotNet\tests\BenchmarkDotNet.IntegrationTests\MemoryDiagnoserTests.cs:line 261
2022-02-16T15:25:06.4529765Z at BenchmarkDotNet.IntegrationTests.MemoryDiagnoserTests.EngineShouldNotIntroduceBoxing(IToolchain toolchain) in D:\a\BenchmarkDotNet\BenchmarkDotNet\tests\BenchmarkDotNet.IntegrationTests\MemoryDiagnoserTests.cs:line 129
https://github.com/dotnet/BenchmarkDotNet/runs/5108471012
2022-02-08T12:10:16.7090614Z [xUnit.net 00:16:37.89] AwaitingTasksShouldNotInterfereAllocationResults(toolchain: InProcessEmitToolchain) [FAIL]
2022-02-08T12:10:17.9153239Z Failed AwaitingTasksShouldNotInterfereAllocationResults(toolchain: InProcessEmitToolchain) [11 s]
2022-02-08T12:10:17.9154475Z Error Message:
2022-02-08T12:10:17.9155156Z Assert.Equal() Failure
2022-02-08T12:10:17.9155686Z Expected: 0
2022-02-08T12:10:17.9156065Z Actual: 100776
2022-02-08T12:10:17.9156480Z Stack Trace:
2022-02-08T12:10:17.9157631Z at BenchmarkDotNet.IntegrationTests.MemoryDiagnoserTests.AssertAllocations(IToolchain toolchain, Type benchmarkType, Dictionary`2 benchmarksAllocationsValidators) in D:\a\BenchmarkDotNet\BenchmarkDotNet\tests\BenchmarkDotNet.IntegrationTests\MemoryDiagnoserTests.cs:line 261
2022-02-08T12:10:17.9159533Z at BenchmarkDotNet.IntegrationTests.MemoryDiagnoserTests.AwaitingTasksShouldNotInterfereAllocationResults(IToolchain toolchain) in D:\a\BenchmarkDotNet\BenchmarkDotNet\tests\BenchmarkDotNet.IntegrationTests\MemoryDiagnoserTests.cs:line 150
For these particular tests, and in case of InProcessEmitToolchain , have you tried using GC.GetAllocatedBytesForCurrentThread instead of GC.GetTotalAllocatedBytes? While you can make sure the Jitting and GC threads don't run, you have no control what the test runner does in the background.
For example, this repro, which doesn't involve BDN at all, will always have some test failures, and only for the total bytes count. The number of failures is reduced if you set the processor count to 1, but even then they still happen.
using System;
using System.Runtime.CompilerServices;
using System.Threading;
using Xunit;
namespace TestProject1
{
public static class Engine {
public static int ParallelTestCount = 0;
[MethodImpl(MethodImplOptions.NoInlining)]
public static ValueTuple<int> ReturnsValueType() => new ValueTuple<int>(0);
public static void Run(int dummy)
{
int parallelTests = Interlocked.Increment(ref Engine.ParallelTestCount);
GC.Collect();
long bytesForThread = GC.GetAllocatedBytesForCurrentThread();
long bytesTotal = GC.GetTotalAllocatedBytes();
_ = dummy;
try
{
_ = ReturnsValueType();
GC.Collect();
bytesForThread = GC.GetAllocatedBytesForCurrentThread() - bytesForThread;
bytesTotal = GC.GetTotalAllocatedBytes() - bytesTotal;
Assert.Equal(1, parallelTests);
Assert.Equal(0, bytesForThread);
Assert.Equal(0, bytesTotal);
}
finally
{
Interlocked.Decrement(ref Engine.ParallelTestCount);
}
}
}
public class UnitTest1
{
private class MyData1 : TheoryData<int> { public MyData1() { for (int i = 0; i < 100; i++) Add(i); } }
[Theory] [ClassData(typeof(MyData1))] public void Test1(int dummy) { Engine.Run(dummy); }
private class MyData2 : TheoryData<int> { public MyData2() { for (int i = 0; i < 100; i++) Add(i); } }
[Theory] [ClassData(typeof(MyData2))] public void Test2(int dummy) { Engine.Run(dummy); }
private class MyData3 : TheoryData<int> { public MyData3() { for (int i = 0; i < 100; i++) Add(i); } }
[Theory] [ClassData(typeof(MyData3))] public void Test3(int dummy) { Engine.Run(dummy); }
}
}
@AndreyAkinshin
Addendum:
have you tried using
GC.GetAllocatedBytesForCurrentThreadinstead ofGC.GetTotalAllocatedBytes?
Yeah, that would render the tests meaningless, since the InProcessEmitToolChain starts a thread... (slightly embarrassed)
Still, the point remains, you have no control what the testhost/testrunner does. Even using vstest with --InIsolation doesn't help in this case.
We have disabled the xUnit parallelization for these tests:
https://github.com/dotnet/BenchmarkDotNet/blob/bd08722ee22fc25a0cf6d389125c5b6eff10cd8f/tests/BenchmarkDotNet.IntegrationTests/xunit.runner.json#L6-L7
So in theory given benchmark should be the only running thread except of Tiered JIT and GC threads.
GitHub/Windows
Perhaps we should exclude these tests for Windows?
https://github.com/dotnet/BenchmarkDotNet/blob/bd08722ee22fc25a0cf6d389125c5b6eff10cd8f/tests/BenchmarkDotNet.IntegrationTests/MemoryDiagnoserTests.cs#L40
We have disabled the xUnit parallelization for these tests:
Yes, I know, I did that too. I also included an assertion for that (because I'm paranoid). And I disabled the long-running test watchdog, since it has its own thread. Then there is a message sink thread that uses Sockets, and some other threads I couldn't make out the purpose of during a quick look with the Debugger - in short, a lot is going on in the background, even if the tests itself are not running parallel.
Why that seems to be a problem only on Windows I have no idea.
I guess the only way to un-flaky them is running the InProcessEmitToolchain tests out-of-process...
Perhaps we should exclude these tests for Windows?
Or that. 😀