BenchmarkDotNet
Reduce async overhead
Follow-up to #1941 (fixes #1595 and fixes #1738)
Refactored the benchmark delegates to reduce measurement overhead for async benchmarks; sync measurements are unchanged.
Also added async support for IterationSetup and IterationCleanup.
Master:
| Method | Mean | Error | StdDev | Gen 0 | Allocated |
|---------------------------- |--------------:|-----------:|-----------:|-------:|----------:|
| AsyncYieldTaskVoid | 1,777.6317 ns | 11.4813 ns | 10.1779 ns | 0.0381 | 120 B |
| AsyncImmediateTaskVoid | 2.3197 ns | 0.0294 ns | 0.0246 ns | - | - |
| AsyncYieldValueTaskVoid | 1,852.0409 ns | 2.9352 ns | 2.7456 ns | 0.0381 | 120 B |
| AsyncImmediateValueTaskVoid | 15.3422 ns | 0.1608 ns | 0.1505 ns | - | - |
| SyncVoid | 0.1591 ns | 0.0042 ns | 0.0037 ns | - | - |
| AsyncYieldTaskInt | 1,850.0632 ns | 13.8105 ns | 12.2426 ns | 0.0381 | 120 B |
| AsyncImmediateTaskInt | 39.0781 ns | 0.2674 ns | 0.2371 ns | 0.0229 | 72 B |
| AsyncYieldValueTaskInt | 1,889.4795 ns | 19.5146 ns | 18.2539 ns | 0.0401 | 128 B |
| AsyncImmediateValueTaskInt | 41.2959 ns | 0.1417 ns | 0.1183 ns | - | - |
| SyncInt | 0.3043 ns | 0.0089 ns | 0.0083 ns | - | - |
This PR:
| Method | Job | UnrollFactor | Mean | Error | StdDev | Gen 0 | Allocated |
|---------------------------- |----------- |------------- |--------------:|-----------:|------------:|-------:|----------:|
| SyncVoid | DefaultJob | 16 | 0.1627 ns | 0.0058 ns | 0.0051 ns | - | - |
| SyncInt | DefaultJob | 16 | 0.3143 ns | 0.0363 ns | 0.0322 ns | - | - |
| AsyncYieldTaskVoid | Job-CXHPYP | 1 | 1,461.6091 ns | 80.2756 ns | 235.4344 ns | 0.0381 | 120 B |
| AsyncImmediateTaskVoid | Job-CXHPYP | 1 | 4.4395 ns | 0.0255 ns | 0.0238 ns | - | - |
| AsyncYieldValueTaskVoid | Job-CXHPYP | 1 | 1,543.3745 ns | 30.5159 ns | 57.3162 ns | 0.0381 | 120 B |
| AsyncImmediateValueTaskVoid | Job-CXHPYP | 1 | 11.1546 ns | 0.0417 ns | 0.0390 ns | - | - |
| AsyncYieldTaskInt | Job-CXHPYP | 1 | 1,597.0157 ns | 15.7257 ns | 13.9404 ns | 0.0381 | 120 B |
| AsyncImmediateTaskInt | Job-CXHPYP | 1 | 43.2262 ns | 0.1489 ns | 0.1393 ns | 0.0229 | 72 B |
| AsyncYieldValueTaskInt | Job-CXHPYP | 1 | 1,576.6878 ns | 30.8910 ns | 53.2854 ns | 0.0401 | 128 B |
| AsyncImmediateValueTaskInt | Job-CXHPYP | 1 | 33.3715 ns | 0.2657 ns | 0.2356 ns | - | - |
public class Benchmark
{
    public long counter;

    [Benchmark]
    public async Task AsyncYieldTaskVoid()
    {
        await Task.Yield();
        unchecked { ++counter; }
    }

    [Benchmark]
    public Task AsyncImmediateTaskVoid()
    {
        unchecked { ++counter; }
        return Task.FromResult(true);
    }

    [Benchmark]
    public async ValueTask AsyncYieldValueTaskVoid()
    {
        await Task.Yield();
        unchecked { ++counter; }
    }

    [Benchmark]
    public ValueTask AsyncImmediateValueTaskVoid()
    {
        unchecked { ++counter; }
        return new ValueTask();
    }

    [Benchmark]
    public void SyncVoid()
    {
        unchecked { ++counter; }
    }

    [Benchmark]
    public async Task<long> AsyncYieldTaskInt()
    {
        await Task.Yield();
        unchecked { return ++counter; }
    }

    [Benchmark]
    public async Task<long> AsyncImmediateTaskInt()
    {
        unchecked { return ++counter; }
    }

    [Benchmark]
    public async ValueTask<long> AsyncYieldValueTaskInt()
    {
        await Task.Yield();
        unchecked { return ++counter; }
    }

    [Benchmark]
    public async ValueTask<long> AsyncImmediateValueTaskInt()
    {
        unchecked { return ++counter; }
    }

    [Benchmark]
    public long SyncInt()
    {
        unchecked { return ++counter; }
    }
}
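For readers wondering why the "Immediate" and "Yield" variants differ so much, here is a minimal sketch of one way a harness can consume an awaitable cheaply. It is not the code in this PR and not a BenchmarkDotNet API: `AwaitOverheadSketch`, `Consume`, `Immediate`, and `Yielded` are hypothetical names used only for illustration. The idea shown is a synchronous fast path for an already-completed `ValueTask<T>`, with a blocking fallback when the operation really does complete asynchronously.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical illustration only; these names are not part of BenchmarkDotNet.
public static class AwaitOverheadSketch
{
    // Consumes a ValueTask<T> without building an async state machine
    // when the operation has already completed successfully.
    public static T Consume<T>(ValueTask<T> task)
    {
        if (task.IsCompletedSuccessfully)
        {
            // Fast path: the result is available synchronously.
            return task.Result;
        }

        // Slow path: convert to a Task and block until it finishes.
        // This sketch assumes no SynchronizationContext is installed
        // (for example, a console application).
        return task.AsTask().GetAwaiter().GetResult();
    }

    // Completes synchronously, like the "Immediate" benchmarks above.
    public static ValueTask<long> Immediate(long value)
    {
        return new ValueTask<long>(value);
    }

    // Completes asynchronously, like the "Yield" benchmarks above.
    public static async ValueTask<long> Yielded(long value)
    {
        await Task.Yield();
        return value;
    }
}
```

Calling `AwaitOverheadSketch.Consume(AwaitOverheadSketch.Immediate(1))` takes the fast path, while `AwaitOverheadSketch.Consume(AwaitOverheadSketch.Yielded(2))` normally falls back to blocking; both return the value that was passed in.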
If you prefer, we can close #1941 in favor of this PR, although this one contains more changes that need review. Alternatively, #1941 can be merged first, which is also fine.
Merging this PR will also make it easier (or possible at all) to add async engine support in the future.
cc @AndreyAkinshin @adamsitnik @stephentoub