WindowsCommunityToolkit icon indicating copy to clipboard operation
WindowsCommunityToolkit copied to clipboard

Feature/stateful dispatcherqueue extensions

Open Sergio0694 opened this issue 4 years ago • 4 comments

Fixes https://github.com/microsoft/microsoft-ui-xaml/issues/3321

NOTE: turns out we can implement that requested functionality ourselves without the need for the WinUI team to do any changes to the WinRT assemblies they're shipping. This is also convenient as just adding new APIs externally is much easier than doing changes to WinRT component. Also WinRT APIs can't be generics anyway. This also gives us more flexibility, as in theory we could add all sorts of other overloads in the future, if we wanted (eg. stateful overloads for our asynchronous extensions).

PR Type

What kind of change does this PR introduce?

  • Feature
  • Optimization

What is the current behavior?

The DispatcherQueue.TryEnqueue API does not have a stateful overload, meaning that (as I've mentioned here), the only way to pass some state to the callback is by using a C# closure. This is problematic from a performance standpoint, especially in cases where this method is invoked very often. Consider a classic example where we want to invoke some method on a control, from the UI thread:

MyControl control = ...;
DispatcherQueue.TryEnqueue(() => control.DoSomething());

This will result in the following code:

// Display class
[CompilerGenerated]
private sealed class <>c__DisplayClass1_0
{
    public MyControl control;

    internal void <Dispatch_WithCapture>b__0()
    {
        control.DoSomething();
    }
}

// The dispatch code
<>c__DisplayClass1_0 <>c__DisplayClass1_ = new <>c__DisplayClass1_0();
<>c__DisplayClass1_.control = control;
dispatcherQueue.TryEnqueue(new Action(<>c__DisplayClass1_.<Dispatch_WithCapture>b__0));

That is, for each invocation we're allocating:

  • The closure class with the captured values (new <>c__DisplayClass1_0())
  • The delegate itself wrapping that closure class (new Action(<>c__DisplayClass1_.<Dispatch_WithCapture>b__0))
  • On top of this, a bunch of CsWinRT wrapping/marshalling types

What is the new behavior?

This PR solves all the issues mentioned above by adding the following new APIs:

namespace CommunityToolkit.WinUI
{
    public delegate void DispatcherQueueHandler<in T>(T state)
        where T : class;

    public delegate void DispatcherQueueHandler<in T1, in T2>(T1 state1, T2 state2)
        where T1 : class
        where T2 : class;

    public static partial class DispatcherQueueExtensions
    {
        public static unsafe bool TryEnqueue<T>(this DispatcherQueue dispatcherQueue, DispatcherQueueHandler<T> callback, T state)
            where T : class;
        public static unsafe bool TryEnqueue<T>(this DispatcherQueue dispatcherQueue, DispatcherQueuePriority priority, DispatcherQueueHandler<T> callback, T state)
            where T : class;
        public static unsafe bool TryEnqueue<T1, T2>(this DispatcherQueue dispatcherQueue, DispatcherQueueHandler<T1, T2> callback, T1 state1, T2 state2)
            where T1 : class
            where T2 : class;
        public static unsafe bool TryEnqueue<T1, T2>(this DispatcherQueue dispatcherQueue, DispatcherQueuePriority priority, DispatcherQueueHandler<T1, T2> callback, T1 state1, T2 state2)
            where T1 : class
            where T2 : class;
    }
}

These let developers explicitly pass a state which will be forwarded to the supplied delegate, so that the delegate itself can be stateless. This in turn lets the C# compiler statically cache it. In practice, we're no longer allocating closure classes, nor delegates, and not even interop wrappers at all, as we're doing everything manually here. Here's some benchmarks:

Method Mean Error StdDev Ratio Gen 0 Gen 1 Gen 2 Allocated
TryEnqueueWithClosure 7.062 us 0.1353 us 0.1852 us 1.00 0.0687 - - 312 B
TryEnqueueWithState 1.139 us 0.0229 us 0.0676 us 0.15 - - - -
TryEnqueueWithClosure2 6.798 us 0.1318 us 0.1465 us 1.00 0.0763 - - 320 B
TryEnqueueWithState2 1.071 us 0.0554 us 0.1607 us 0.14 - - - -

You can see how the new APIs are almost 7x faster than the built-in ones, and they're also completely allocation free 🚀

This might be especially useful for library authors writing controls and trying to reduce their own overhead on consumers.

PR Checklist

Please check if your PR fulfills the following requirements:

  • [X] Tested code with current supported SDKs
  • [ ] Pull Request has been submitted to the documentation repository instructions. Link:
  • [ ] Sample in sample app has been added / updated (for bug fixes / features)
  • [X] New major technical changes in the toolkit have or will be added to the Wiki e.g. build changes, source generators, testing infrastructure, sample creation changes, etc...
  • [X] Tests for the changes have been added (for bug fixes / features) (if applicable)
  • [X] Header has been added to all new source files (run build/UpdateHeaders.bat)
  • [X] Contains NO breaking changes

Sergio0694 avatar Jul 05 '21 10:07 Sergio0694

Thanks Sergio0694 for opening a Pull Request! The reviewers will test the PR and highlight if there is any conflict or changes required. If the PR is approved we will proceed to merge the pull request 🙌

ghost avatar Jul 05 '21 10:07 ghost

Any tests we can add?

michael-hawker avatar Jul 07 '21 21:07 michael-hawker

@michael-hawker Yup, I was thinking we might want to add some tests as well, and/or use these extensions somewhere in the sample app so that we can test them interactively as well. I also just wanted to get your thoughts and Alex's on this proposal too though, as in, whether having code like this is ok in the Toolkit. I understand there's some... Uh... Trickery involved ahah 😄

Sergio0694 avatar Jul 09 '21 12:07 Sergio0694

Added a slight implementation change in https://github.com/CommunityToolkit/WindowsCommunityToolkit/pull/4097/commits/9e6431f894593221c14e8a8ad48d3f30e67c8f2e after talking with @jakobbotsch (thanks again for the help!) to remove the type safety violation and ref type aliasing we were doing over the stored delegate in order to invoke it as contravariant in its inputs (which was in turn a workaround the lack of generic support for methods being invoked from native code due to the generic context). That addresses the potential issue with that trick that has been discussed with Jan and Levi here and here.

This PR should be good for an initial review pass now (and as previously discussed with Michael, I'm also waiting for confirmation that we're fine with integrating code like this into the Toolkit in general, given its "unfriendliness"). The only other missing optimization left is to just swap those Marshal.AllocHGlobal calls for NativeMemory.Alloc once we upgrade to .NET 6, but that is not critical to how this feature works and we can just do that in a follow up PR following the move to .NET 6 for WASDK 1.0 😄

Sergio0694 avatar Sep 12 '21 17:09 Sergio0694