odata.net

Performance refactor - elide async and await

Open gathogojr opened this issue 2 years ago • 2 comments

Issues

This pull request comprises a performance refactor - elide async and await

Description

When implementing asynchronous support, I changed some methods to async so that even bonehead exceptions would be caught by the async state machine and placed on the returned task. That category of methods takes the following form:

async Task MethodAsync()
{
    VerifyYadaYada(); // Throws exception if expected conditions are not met
    await MethodAnotherAsync();
}

The VerifyYadaYada methods verify, for example, that the stream is not disposed, or that a delta link is being written within a delta resource set, etc. Without the async keyword, the exception is thrown directly to the caller rather than being placed on the returned task. In this PR, I have dropped the async modifier from such methods for the following reasons:

  • Unless there's a bug in our code, bonehead exceptions resulting from verification failures should never happen.
  • On the strength of the above point, we trade off pure asynchronous semantics (the exception being placed on the task returned from the method) for performance. There's little sense in incurring a performance cost (construction of the state machine, instantiation of Task objects, etc.) for a condition that will almost never occur. If it does occur, the exception will still be caught by a method with the async modifier higher up in the call stack.
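The difference in exception semantics described above can be observed directly. The following is a minimal sketch, not code from this PR: ElisionSemantics, Verify, WithAsync, WithoutAsync and ThrowsSynchronously are hypothetical names chosen for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class ElisionSemantics
{
    // Hypothetical precondition check standing in for VerifyYadaYada.
    private static void Verify(bool ok)
    {
        if (!ok)
        {
            throw new InvalidOperationException("Precondition failed");
        }
    }

    // With async: the exception is captured and placed on the returned task.
    public static async Task WithAsync(bool ok)
    {
        Verify(ok);
        await Task.Yield();
    }

    // Without async: the exception is thrown before any task is returned.
    public static Task WithoutAsync(bool ok)
    {
        Verify(ok);
        return Task.CompletedTask;
    }

    // Returns true if invoking the call throws instead of returning a task.
    public static bool ThrowsSynchronously(Func<Task> call)
    {
        try
        {
            Task task = call();
            _ = task.Exception; // Observe a fault, if any, so it is not left unobserved
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

Here ThrowsSynchronously(() => WithAsync(false)) returns false because the failure surfaces only when the task is awaited, while ThrowsSynchronously(() => WithoutAsync(false)) returns true. A caller that awaits the result inside its own async method sees the exception in both cases.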

Another category of methods where I dropped the async modifier are those where the asynchronous logic is run conditionally. By placing the asynchronous logic in a local function, it becomes possible to elide async and await. We get a performance benefit especially where the asynchronous logic is almost never executed. Such methods take the following form:

async Task MethodAsync()
{
    if (true) // Check condition
    {
        // Asynchronous methods
    }
}

To avoid paying a higher cost when the condition is false, the rewritten method looks as follows:

Task MethodAsync()
{
    if (true) // Check condition
    {
        return MethodInnerAsync();

        async Task MethodInnerAsync()
        {
            // Asynchronous methods
        }
    }

    return TaskUtils.CompletedTask; // Static property that returns a completed task instance
}

This PR also elides async and await where all branches of logic in the asynchronous method consist of a single asynchronous method call. For example:

async Task MethodAsync()
{
    if (true) // Condition check
    {
        await MethodAnotherAsync();
    }
    else if (true) // Another condition check
    {
        await MethodYetAnotherAsync();
    }
    else
    {
        // Reached only when neither condition holds
        throw new ODataException("Houston, we have a problem");
    }
}

We rewrite this category of methods as follows:

Task MethodAsync()
{
    if (true) // Condition check
    {
        return MethodAnotherAsync();
    }
    else if (true) // Another condition check
    {
        return MethodYetAnotherAsync();
    }

    return TaskUtils.GetFaultedTask(new ODataException("Houston, we have a problem"));
}

This PR also drops the use of expensive methods defined in the TaskUtils class (FollowOnSuccessWithTask, FollowOnSuccessWith, etc.). Local functions that achieve the same purpose are introduced instead.
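To illustrate the kind of replacement meant here, the sketch below contrasts a continuation-style helper with an async local function. It is an assumption-laden illustration: FollowWith is a simplified stand-in and not the actual TaskUtils implementation, and ContinuationSketch and LengthAfterAsync are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationSketch
{
    // Simplified stand-in for a continuation helper; NOT the TaskUtils code.
    public static Task<TResult> FollowWith<TResult>(Task antecedent, Func<TResult> next)
    {
        return antecedent.ContinueWith(
            t =>
            {
                t.GetAwaiter().GetResult(); // Rethrows the antecedent's failure, if any
                return next();
            },
            TaskContinuationOptions.ExecuteSynchronously);
    }

    // The same flow expressed with an async local function.
    public static Task<int> LengthAfterAsync(Task antecedent, string text)
    {
        return LengthAfterInnerAsync();

        async Task<int> LengthAfterInnerAsync()
        {
            await antecedent.ConfigureAwait(false);
            return text.Length;
        }
    }
}
```

The local-function form avoids the delegate and continuation plumbing of the helper. It still captures antecedent and text, so it is not necessarily allocation-free.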

Some supported asynchronous methods currently don't perform any asynchronous operation. For example, CreateODataResourceWriterAsync currently only initializes the writer used for writing OData resources (same logic as CreateODataResourceWriter). It makes use of the GetTaskForSynchronousOperation method defined in the TaskUtils class. That method is used as follows:

return TaskUtils.GetTaskForSynchronousOperation(
    () => this.CreateODataResourceWriterImplementation(navigationSource, resourceType));

Where CreateODataResourceWriterImplementation is a synchronous method.

I considered replacing all instances of GetTaskForSynchronousOperation with Task.FromResult but held back, since the GetTaskForSynchronousOperation method does some exception handling and returns a faulted task for all exceptions except OutOfMemoryException. It also makes use of 2 helper methods (GetCompletedTask and GetFaultedTask). I did some benchmarking to see whether using Task.FromResult and Task.FromException instead would be better.

[MemoryDiagnoser]
public class CompletedTaskBenchmark
{
    private static readonly Type OutOfMemoryExceptionType = typeof(OutOfMemoryException);

    [Benchmark]
    public Task CompletedTaskUsingFromResultAsync()
    {
        return Task.FromResult("Task completed");
    }

    [Benchmark]
    public Task CompletedTaskUsingTaskCompletionSourceAsync()
    {
        return GetCompletedTask("Task completed");
    }

    [Benchmark]
    public Task CompletedTaskUsingGetTaskForSynchronousOperationAsync()
    {
        return GetTaskForSynchronousOperation(() => "Task completed");
    }

    [Benchmark]
    public Task CompletedTaskUsingGetTaskForSynchronousOperationRevisedAsync()
    {
        return GetTaskForSynchronousOperationRevised(() => "Task completed");
    }

    static Task<T> GetCompletedTask<T>(T value)
    {
        TaskCompletionSource<T> taskCompletionSource = new TaskCompletionSource<T>();
        taskCompletionSource.SetResult(value);
        return taskCompletionSource.Task;
    }

    internal static Task<T> GetFaultedTask<T>(Exception exception)
    {
        TaskCompletionSource<T> taskCompletionSource = new TaskCompletionSource<T>();
        taskCompletionSource.SetException(exception);
        return taskCompletionSource.Task;
    }

    static Task<T> GetTaskForSynchronousOperation<T>(Func<T> synchronousOperation)
    {
        try
        {
            T result = synchronousOperation();
            return GetCompletedTask(result);
        }
        catch (Exception ex)
        {
            Type type = ex.GetType();

            if (type == OutOfMemoryExceptionType)
            {
                throw;
            }

            return GetFaultedTask<T>(ex);
        }
    }

    static Task<T> GetTaskForSynchronousOperationRevised<T>(Func<T> synchronousOperation)
    {
        try
        {
            T result = synchronousOperation();
            return Task.FromResult(result);
        }
        catch (Exception ex)
        {
            Type type = ex.GetType();

            if (type == OutOfMemoryExceptionType)
            {
                throw;
            }

            return Task.FromException<T>(ex);
        }
    }
}

Below were the results:

|                                                       Method |      Mean |     Error |    StdDev |  Gen 0 | Allocated |
|------------------------------------------------------------- |----------:|----------:|----------:|-------:|----------:|
|                            CompletedTaskUsingFromResultAsync |  8.071 ns | 0.1713 ns | 0.2344 ns | 0.0172 |      72 B |
|                   CompletedTaskUsingTaskCompletedSourceAsync | 37.606 ns | 0.6161 ns | 0.5763 ns | 0.0229 |      96 B |
|        CompletedTaskUsingGetTaskForSynchronousOperationAsync | 40.309 ns | 0.4737 ns | 0.4431 ns | 0.0229 |      96 B |
| CompletedTaskUsingGetTaskForSynchronousOperationRevisedAsync | 13.357 ns | 0.2933 ns | 0.3709 ns | 0.0172 |      72 B |

From the above results, it's clear that the GetTaskForSynchronousOperation method is expensive. Using TaskCompletionSource to create a task instance has a higher cost than calling Task.FromResult. I have refactored GetTaskForSynchronousOperation to make use of both Task.FromResult and Task.FromException (where possible, since Task.FromException is not supported across all frameworks that OData supports).
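One way to handle the "where possible" caveat is conditional compilation. The sketch below is only an assumption about how that could look: the LEGACY_NO_TASK_FROM_EXCEPTION symbol and the FaultedTaskSketch class are hypothetical, and the PR may gate the call differently.

```csharp
using System;
using System.Threading.Tasks;

public static class FaultedTaskSketch
{
    public static Task<T> GetFaultedTask<T>(Exception exception)
    {
#if LEGACY_NO_TASK_FROM_EXCEPTION // Hypothetical symbol for targets lacking Task.FromException
        // Fallback path: build the faulted task through TaskCompletionSource.
        TaskCompletionSource<T> taskCompletionSource = new TaskCompletionSource<T>();
        taskCompletionSource.SetException(exception);
        return taskCompletionSource.Task;
#else
        // Preferred path: cheaper than going through TaskCompletionSource.
        return Task.FromException<T>(exception);
#endif
    }
}
```

Both paths return a task in the Faulted state whose Exception property wraps the supplied exception, so callers should not be able to tell them apart.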

Checklist (Uncheck if it is not completed)

  • [ ] Test cases added
  • [ ] Build and test with one-click build and test script passed

Additional work necessary

If documentation update is needed, please add "Docs Needed" label to the issue and provide details about the required document change in the issue.

gathogojr, Aug 11 '22 06:08

Can we get some benchmarks for this change?

KenitoInc, Aug 17 '22 12:08

I ran your branch against some of the benchmarks in the repo. I found that it sped up the async writer scenarios (by up to 8%, awesome!) but it also increased memory allocations:

Before

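|           Method |                              WriterName |       Mean |     Error |    StdDev |      Gen 0 |     Gen 1 |  Allocated |
|----------------- |---------------------------------------- |-----------:|----------:|----------:|-----------:|----------:|-----------:|
| WriteToFileAsync |                ODataMessageWriter-Async | 900.471 ms | 5.0685 ms | 4.7411 ms | 66000.0000 | 1000.0000 | 406,503 KB |
| WriteToFileAsync | ODataMessageWriter-Utf8JsonWriter-Async | 487.397 ms | 0.5287 ms | 0.4945 ms | 44000.0000 | 1000.0000 | 273,429 KB |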

After

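|           Method |                              WriterName |       Mean |     Error |    StdDev |      Gen 0 |     Gen 1 |  Allocated |
|----------------- |---------------------------------------- |-----------:|----------:|----------:|-----------:|----------:|-----------:|
| WriteToFileAsync |                ODataMessageWriter-Async | 857.137 ms | 2.1981 ms | 1.9486 ms | 67000.0000 | 1000.0000 | 408,658 KB |
| WriteToFileAsync | ODataMessageWriter-Utf8JsonWriter-Async | 448.118 ms | 0.4156 ms | 0.3887 ms | 47000.0000 | 1000.0000 | 292,088 KB |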

I compared allocation reports from the VS profiler and found that display classes in the modified async methods account for part of the extra allocations:

Before

image

After

image
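For context on where such display classes can come from: as far as I understand, when an async local function captures parameters or locals of the enclosing method, the compiler hoists them into a heap-allocated closure object (the display class), and that object is typically created when the enclosing method is entered. The sketch below contrasts a capturing local function with a static local function (C# 8 or later) that receives its state as parameters. CaptureSketch and its members are hypothetical names, not code from this PR.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class CaptureSketch
{
    // Capturing version: 'writer' and 'text' are captured by the async local
    // function, so a closure object (display class) is needed to hold them.
    public static Task WriteCapturingAsync(TextWriter writer, string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return Task.CompletedTask;
        }

        return WriteInnerAsync();

        async Task WriteInnerAsync()
        {
            await writer.WriteAsync(text).ConfigureAwait(false);
            await writer.FlushAsync().ConfigureAwait(false);
        }
    }

    // Non-capturing version: state is passed explicitly to a static local
    // function, so no closure over the enclosing method's variables is needed.
    public static Task WriteNonCapturingAsync(TextWriter writer, string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return Task.CompletedTask;
        }

        return WriteInnerAsync(writer, text);

        static async Task WriteInnerAsync(TextWriter innerWriter, string innerText)
        {
            await innerWriter.WriteAsync(innerText).ConfigureAwait(false);
            await innerWriter.FlushAsync().ConfigureAwait(false);
        }
    }
}
```

Both methods behave identically from the caller's point of view. Whether the second form actually reduces allocations in a given method would need to be confirmed with the profiler, as was done above.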

Here's a full report from the ResultsComparer tool that shows the difference in allocation size for different methods between master and your branch. You can investigate the methods in the New and Worse tables for potential regressions.

summary:
better: 13, geomean: 1.678
worse: 25, geomean: 1.754
new (results in the diff that are not in the base): 15
missing (results in the base that are not in the diff): 14
total diff: 67

| Worse | diff/base | Base Self Size (bytes) | Diff Self Size (bytes) | Modality |
|------ |----------:|-----------------------:|-----------------------:|---------:|
| Microsoft.OData.Json.JsonWriter.WriteNameAsync() | Infinity | 0.00 | 544.00 | |
| Microsoft.OData.ODataWriterCore.WriteStartAsync(Microsoft.OData.ODataNestedResou | Infinity | 0.00 | 38400.00 | |
| Microsoft.OData.Json.JsonWriterAsyncExtensions.WritePrimitiveValueAsync(Microsof | Infinity | 0.00 | 80.00 | |
| Microsoft.OData.ODataMessageWriter<T>.WriteToOutputAsync.__WriteToOutputInnerAsy | Infinity | 0.00 | 1056.00 | |
| Microsoft.OData.JsonLight.ODataJsonLightWriter.WriteResourceSetDeltaLinkAsync(Sy | Infinity | 0.00 | 128.00 | |
| Microsoft.OData.JsonLight.ODataJsonLightWriter.WriteResourceSetNextLinkAsync(Sys | Infinity | 0.00 | 16160.00 | |
| Microsoft.OData.Json.ODataJsonWriterUtils.StartJsonPaddingIfRequiredAsync(Micros | Infinity | 0.00 | 64.00 | |
| Microsoft.OData.JsonLightInstanceAnnotationWriter.WriteInstanceAnnotationsAsync( | Infinity | 0.00 | 115488.00 | |
| Microsoft.OData.ODataWriterCore.WriteStartResourceImplementationAsync() | 28.36842105263158 | 152.00 | 4312.00 | |
| Microsoft.OData.JsonLight.ODataJsonLightValueSerializer.WriteCollectionValueAsyn | 4.52 | 600.00 | 2712.00 | |
| Microsoft.OData.ODataResponseMessage.GetStreamAsync() | 3.375 | 128.00 | 432.00 | |
| Microsoft.OData.ODataWriterCore.WriteStartAsync() | 2.573529411764706 | 544.00 | 1400.00 | |
| microsoft.aspnetcore.server.kestrel.core.il | 2.5696681701030926 | 12416.00 | 31905.00 | |
| Microsoft.OData.ODataMessageWriter<T>.MoveNext() | 1.561307901907357 | 5872.00 | 9168.00 | |
| Microsoft.OData.JsonLight.ODataJsonLightSerializer.WriteContextUriPropertyAsync( | 1.3289473684210527 | 608.00 | 808.00 | |
| Microsoft.OData.ODataWriterCore.WriteStartResourceSetImplementationAsync() | 1.2714570858283434 | 4008.00 | 5096.00 | |
| microsoft.aspnetcore.http.abstractions.il | 1.16 | 200.00 | 232.00 | |
| Microsoft.OData.Json.JsonWriter.StartScopeAsync() | 1.1457203877115163 | 48696.00 | 55792.00 | |
| Microsoft.OData.JsonLight.ODataJsonLightPropertySerializer.WritePropertyInfoAsyn | 1.0935251798561152 | 1112.00 | 1216.00 | |
| Microsoft.OData.ODataWriterCore.WriteEndImplementationAsync.AnonymousMethod__199 | 1.0819672131147542 | 488.00 | 528.00 | |
| Microsoft.OData.Json.JsonValueUtils.WriteEscapedJsonStringValueAsync() | 1.078125 | 512.00 | 552.00 | |
| microsoft.odata.core | 1.0406494651047598 | 3730824.00 | 3882480.00 | |
| microsoft.aspnetcore.il | 1.025681700849486 | 958815.00 | 983439.00 | |
| microsoft.aspnetcore.httpspolicy.il | 1.0207206052057956 | 77990.00 | 79606.00 | |
| microsoft.extensions.hosting.abstractions.il | 1.0045951714422188 | 496173.00 | 498453.00 | |

| Better | base/diff | Base Self Size (bytes) | Diff Self Size (bytes) | Modality |
|------- |----------:|-----------------------:|-----------------------:|---------:|
| Microsoft.OData.ODataWriterCore.WriteStartAsync(Microsoft.OData.ODataResourceSet | Infinity | 1056.00 | 0.00 | |
| Microsoft.OData.Json.JsonWriter.WriteNameAsync(string) | Infinity | 1056.00 | 0.00 | |
| Microsoft.OData.JsonLight.ODataJsonLightPropertySerializer.WritePropertyAsync() | 12.063829787234043 | 4536.00 | 376.00 | |
| Microsoft.OData.TaskUtils.GetTaskForSynchronousOperation<T>(System.Func<T>) | 6.2 | 248.00 | 40.00 | |
| microsoft.aspnetcore.mvc.il | 1.860798433894987 | 53230.00 | 28606.00 | |
| Microsoft.OData.JsonLight.ODataJsonLightWriter.StartResourceSetAsync() | 1.6914414414414414 | 12016.00 | 7104.00 | |
| Microsoft.OData.ODataWriterCore.WriteStartResourceImplementationAsync.AnonymousM | 1.1646489104116222 | 3848.00 | 3304.00 | |
| testserver | 1.0358360906082973 | 40698.00 | 39290.00 | |
| System.Private.CoreLib.il | 1.0145744493926645 | 6588335.00 | 6493693.00 | |
| microsoft.extensions.logging.console.il | 1.012705798138869 | 11318.00 | 11176.00 | |
| system.private.uri.il | 1.012666244458518 | 12792.00 | 12632.00 | |
| microsoft.aspnetcore.mvc.core.il | 1.0023767082590611 | 701792.00 | 700128.00 | |
| experimentslib | 1.0017169292402588 | 378066.00 | 377418.00 | |

| New | diff/base | Base Self Size (bytes) | Diff Self Size (bytes) | Modality |
|---- |----------:|-----------------------:|-----------------------:|---------:|
| TestServer (PID: 24224) | N/A | 33810.00 | N/A | |
| Microsoft.OData.JsonLight.ODataJsonLightWriter.EndNestedResourceInfoWithContentA | N/A | 12800.00 | N/A | |
| Microsoft.OData.JsonLight.ODataJsonLightWriter.WriteResourceSetCountAsync(System | N/A | 9600.00 | N/A | |
| Microsoft.OData.ODataMessage.GetStreamAsync.__GetMessageStreamAsync0() | N/A | 0.00 | N/A | |
| Microsoft.OData.WriterValidator.ValidateTypeKind(Microsoft.OData.Edm.EdmTypeKind | N/A | 0.00 | N/A | |
| Microsoft.OData.ODataWriterCore.CheckForNestedResourceInfoWithContentAsync.Anony | N/A | 0.00 | N/A | |
| Microsoft.OData.UriParser.PropertySegment.ctor(Microsoft.OData.Edm.IEdmStructura | N/A | 0.00 | N/A | |
| Microsoft.OData.JsonLightInstanceAnnotationWriter.ctor(Microsoft.OData.JsonLight | N/A | 0.00 | N/A | |
| Microsoft.OData.JsonLight.ODataJsonLightWriterUtils.WriteInstanceAnnotationNameA | N/A | 0.00 | N/A | |
| Microsoft.OData.Metadata.EdmLibraryExtensions.GetCollectionItemTypeName(string, | N/A | 0.00 | N/A | |
| Microsoft.OData.Json.JsonWriter.EndObjectScopeAsync() | N/A | 0.00 | N/A | |
| Microsoft.OData.MediaTypeUtilsMatchInfoConcurrentCache.Add(MatchInfoCacheKey, Me | N/A | 0.00 | N/A | |
| Microsoft.OData.ODataWriterCore.CheckForNestedResourceInfoWithContentAsync.Anony | N/A | 0.00 | N/A | |
| microsoft.aspnetcore.diagnostics.il | N/A | 152.00 | N/A | |
| system.console.il | N/A | 46.00 | N/A | |

| Missing | diff/base | Base Self Size (bytes) | Diff Self Size (bytes) | Modality |
|-------- |----------:|-----------------------:|-----------------------:|---------:|
| TestServer (PID: 6408) | N/A | 33794.00 | N/A | |
| Microsoft.OData.ODataMessage.GetStreamAsync(System.Func<System.Threading.Tasks.T | N/A | 5670.00 | N/A | |
| Microsoft.OData.Json.JsonWriterAsyncExtensions.WritePrimitiveValueAsync() | N/A | 2192.00 | N/A | |
| Microsoft.OData.ODataWriterCore.VerifyCanWriteStartResourceSetAsync() | N/A | 1400.00 | N/A | |
| Microsoft.OData.TaskUtils.FollowOnSuccessWithContinuation<T>(System.Threading.Ta | N/A | 1384.00 | N/A | |
| Microsoft.OData.TaskUtils.FollowOnSuccessWithImplementation<T>(System.Threading. | N/A | 1216.00 | N/A | |
| Microsoft.OData.ODataMessage.GetStreamAsync.AnonymousMethod__0(System.Threading. | N/A | 666.00 | N/A | |
| Microsoft.OData.TaskUtils.IgnoreExceptions(System.Threading.Tasks.Task) | N/A | 208.00 | N/A | |
| Microsoft.OData.TaskUtils.GetCompletedTask<T>(T) | N/A | 48.00 | N/A | |
| Microsoft.OData.TaskUtils.cctor() | N/A | 24.00 | N/A | |
| Microsoft.OData.ODataWriterCore.WriteStartAsync(Microsoft.OData.ODataResource) | N/A | 0.00 | N/A | |
| Microsoft.OData.ODataWriterCore.VerifyCanWriteStartResourceSetAsync(bool, Micros | N/A | 0.00 | N/A | |
| microsoft.extensions.logging.abstractions.il | N/A | 24.00 | N/A | |
| system.net.security.il | N/A | 24.00 | N/A | |

habbes, Aug 23 '22 05:08

I've re-run the "SerializationComparisons" benchmarks and now I see improvements in both performance and memory of up to 10%. Great work.

Before

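|           Method |                              WriterName |       Mean |     Error |    StdDev |      Gen 0 |     Gen 1 |  Allocated |
|----------------- |---------------------------------------- |-----------:|----------:|----------:|-----------:|----------:|-----------:|
| WriteToFileAsync |                ODataMessageWriter-Async | 911.242 ms | 8.6172 ms | 7.6389 ms | 66000.0000 | 1000.0000 | 403,823 KB |
| WriteToFileAsync | ODataMessageWriter-Utf8JsonWriter-Async | 471.671 ms | 0.6272 ms | 0.5867 ms | 44000.0000 | 1000.0000 | 271,267 KB |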

After

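|           Method |                              WriterName |       Mean |     Error |    StdDev |      Gen 0 |     Gen 1 |  Allocated |
|----------------- |---------------------------------------- |-----------:|----------:|----------:|-----------:|----------:|-----------:|
| WriteToFileAsync |                ODataMessageWriter-Async | 816.809 ms | 2.3509 ms | 2.0840 ms | 59000.0000 | 1000.0000 | 362,984 KB |
| WriteToFileAsync | ODataMessageWriter-Utf8JsonWriter-Async | 429.522 ms | 0.7683 ms | 0.7187 ms | 40000.0000 | 1000.0000 | 246,378 KB |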

habbes, Sep 19 '22 06:09

Can we get some benchmarks for this change?

I added some before-and-after comparison results, and @habbes also ran some benchmarks and shared the data in a comment.

gathogojr, Sep 19 '22 20:09

This PR has 1533 quantified lines of changes. In general, a change size of up to 200 lines is ideal for the best PR experience!


Quantification details

Label      : Extra Large
Size       : +980 -553
Percentile : 100%

Total files changed: 41

Change summary by file extension:
.cs : +980 -508
.bsl : +0 -45

Change counts above are quantified counts, based on the PullRequestQuantifier customizations.

Why proper sizing of changes matters

Optimal pull request sizes drive a better, more predictable PR flow as they strike a balance between PR complexity and PR review overhead. PRs within the optimal size (typically small or medium-sized PRs) mean:

  • Fast and predictable releases to production:
    • Optimal size changes are more likely to be reviewed faster with fewer iterations.
    • Similarity in low PR complexity drives similar review times.
  • Review quality is likely higher as complexity is lower:
    • Bugs are more likely to be detected.
    • Code inconsistencies are more likely to be detected.
  • Knowledge sharing is improved within the participants:
    • Small portions can be assimilated better.
  • Better engineering practices are exercised:
    • Solving big problems by dividing them in well contained, smaller problems.
    • Exercising separation of concerns within the code changes.

What can I do to optimize my changes

  • Use the PullRequestQuantifier to quantify your PR accurately
    • Create a context profile for your repo using the context generator
    • Exclude files that are not necessary to be reviewed or do not increase the review complexity. Example: Autogenerated code, docs, project IDE setting files, binaries, etc. Check out the Excluded section from your prquantifier.yaml context profile.
    • Understand your typical change complexity, drive towards the desired complexity by adjusting the label mapping in your prquantifier.yaml context profile.
    • Only use the labels that matter to you, see context specification to customize your prquantifier.yaml context profile.
  • Change your engineering behaviors
    • For PRs that fall outside of the desired spectrum, review the details and check if:
      • Your PR could be split in smaller, self-contained PRs instead
      • Your PR only solves one particular issue. (For example, don't refactor and code new features in the same PR).

How to interpret the change counts in git diff output

  • One line was added: +1 -0
  • One line was deleted: +0 -1
  • One line was modified: +1 -1 (git diff doesn't know about modified, it will interpret that line like one addition plus one deletion)
  • Change percentiles: Change characteristics (addition, deletion, modification) of this PR in relation to all other PRs within the repository.

