Avalonia icon indicating copy to clipboard operation
Avalonia copied to clipboard

Regression in startup time 0.9 -> 0.10.

Open grokys opened this issue 3 years ago • 13 comments

Describe the bug

One of our customers has made us aware of a regression in startup time between 0.9 and 0.10. Tested this on my machine and found that this is indeed the case.

I measured the startup time in a blank app by starting a System.Diagnostics.Stopwatch in Program.Main and stopping it in the main window Activated event. The program was compiled in a standard release build and with a ReadyToRun (R2R) deployment:

Version Theme Startup R2R
0.9.12 Default 0.36s
0.10.0 Default 0.71s
0.10.0 Default 0.45s x
0.10.0 Fluent 0.85s
0.10.0 Fluent 0.49s x

Wasn't able to test 0.9 in R2R mode as the program fails to run.

I will report the results of my investigation below.

grokys avatar Mar 15 '21 13:03 grokys

The difference between 0.9.12 and 0.10.0 with the default theme is pretty big. However building it in R2R mode mostly offsets the difference. We suspect that this degradation in performance was caused by #3255, which introduces a typed value store where previously we had a boxing value store. The extensive usage of generics there is causing the JIT to do more work on startup.

Profiling in dotTrace using the Timeline profiler seems to bear this out:

image

(0.9 on the left, 0.10 on the right. Note that the times here are greater as running in the profiler slows the app down significantly)

Where we've shaved off 500ms in user code we've added 400ms of JIT time. Not sure why the majority of time here is showing up as Native code though.

grokys avatar Mar 15 '21 13:03 grokys

My next port of call was to try to work out just why the Fluent theme is so much slower. It's a much bigger more complex theme than Default of course so there would be some performance cost to be expected; however I wanted to see if I could improve startup time with this theme as this is what most people would be using.

My initial profiling led me to see that we're calling StyleInclude.TryGetResource nearly 50,000 times on startup and that we're doing more than 80,000 resource lookups. When the SystemAccentColor is read for the first time, these are the resulting ~8000 calls to StyleInclude.TryGetResource:

https://gist.github.com/grokys/5d1a3ded2509808807b5e699f5876672

That doesn't look good. What's happening is that the first call to get a resource causes the fluent theme to get loaded which triggers all the resources in those styles to be looked up, and for each resource lookup all the XAML files in the fluent theme need to be searched. The process of loading the fluent theme takes around 0.5s in total.

grokys avatar Mar 15 '21 13:03 grokys

I suspect this use of generics also caused corert builds to grow significantly. The value store rather fills the output with

<IlcArg Include="-#-verbose" />
<IlcArg Include="-#-singlethreaded" />

on a corert build

ahopper avatar Mar 15 '21 13:03 ahopper

I suspect this use of generics also caused corert builds to grow significantly.

Yes, in fact the reason I knew to look immediately at JIT time was due to a conversation in the runtimelab repository.

Thing is, we started off with a boxing/reflection-based which was criticized because it was slow, caused a lot of allocations and didn't work well with CoreRT.

So what do you use to replace boxing/reflection? Generics. Indeed there is a draft PR for a typed binding system (https://github.com/AvaloniaUI/Avalonia/pull/5510) which uses generics throughout the binding chain instead of boxing. We may well have to reconsider this direction.

Feels like we're in a lose-lose situation here. You push one end in and the other pops out.

grokys avatar Mar 15 '21 13:03 grokys

Anyway back to investigating the fluent theme... Given that looking up resources causes so many calls trying to look for each resource, my next experiment was write a tool to merge the fluent theme into a single XAML file and use that.

Running with the merged fluent theme reduced our resource lookups from 50,000 to 640! 🎉

With no noticeable improvement to startup time. 😢

Seems that looping over objects and doing dictionary lookups is actually pretty fast, even if you do it a lot.

Looking at the profiling information in dotMemory shows us that most of the time in building the fluent theme is unaccounted for in user code:

100.00%   xaml  •  1,030 ms  •  1 call  •  CompiledAvaloniaXaml.!AvaloniaResources.xaml(IServiceProvider)
  54.16%   xaml  •  558 ms  •  1 call  •  CompiledAvaloniaXaml.!AvaloniaResources.xaml(IServiceProvider, Styles)
    ►7.38%   Bind  •  76 ms  •  13 calls  •  Avalonia.AvaloniaObjectExtensions.Bind(IAvaloniaObject, AvaloniaProperty, IBinding, Object)
    ►5.19%   ContextMenu..ctor  •  53 ms  •  1 call  •  Avalonia.Controls.ContextMenu..ctor
    ►3.33%   Add  •  34 ms  •  3 calls  •  Avalonia.Collections.AvaloniaList`1.Add(Object)
    ►3.06%   MenuItem..ctor  •  31 ms  •  3 calls  •  Avalonia.Controls.MenuItem..ctor
    ►3.01%   set_Color  •  31 ms  •  328 calls  •  Avalonia.Media.SolidColorBrush.set_Color(Color)
    ►2.46%   CalendarItem..cctor  •  25 ms  •  1 call  •  Avalonia.Controls.Primitives.CalendarItem..cctor
    ►2.33%   EndInit  •  24 ms  •  4 calls  •  Avalonia.Controls.Primitives.SelectingItemsControl.EndInit
    ►2.28%   Parse  •  23 ms  •  8 calls  •  Avalonia.Media.StreamGeometry.Parse(String)
    ►1.75%   Parse  •  18 ms  •  2 calls  •  Avalonia.Media.Transformation.TransformOperations.Parse(String)
    ►0.95%   ScrollViewer..cctor  •  10 ms  •  1 call  •  Avalonia.Controls.ScrollViewer..cctor
    ►0.91%   set_Opacity  •  9 ms  •  19 calls  •  Avalonia.Media.Brush.set_Opacity(Double)
    ►0.90%   get_CutGesture  •  9 ms  •  1 call  •  Avalonia.Controls.TextBox.get_CutGesture
    ►0.83%   Path..cctor  •  9 ms  •  1 call  •  Avalonia.Controls.Shapes.Path..cctor
    ►0.83%   ProvideValue  •  9 ms  •  657 calls  •  Avalonia.Markup.Xaml.MarkupExtensions.DynamicResourceExtension.ProvideValue(IServiceProvider)
    ►0.77%   ProvideValue  •  8 ms  •  621 calls  •  Avalonia.Markup.Xaml.MarkupExtensions.StaticResourceExtension.ProvideValue(IServiceProvider)
    ►0.75%   ComboBox..cctor  •  8 ms  •  1 call  •  Avalonia.Controls.ComboBox..cctor
    ►0.74%   Parse  •  8 ms  •  8 calls  •  Avalonia.Animation.Easings.Easing.Parse(String)
    ►0.70%   GridSplitter..cctor  •  7 ms  •  1 call  •  Avalonia.Controls.GridSplitter..cctor
    ►0.67%   ProvideValue  •  7 ms  •  37 calls  •  Avalonia.Markup.Xaml.MarkupExtensions.ReflectionBindingExtension.ProvideValue(IServiceProvider)
    ►0.64%   SolidColorBrush..ctor  •  7 ms  •  335 calls  •  Avalonia.Media.SolidColorBrush..ctor
    ►0.62%   AutoCompleteBox..cctor  •  6 ms  •  1 call  •  Avalonia.Controls.AutoCompleteBox..cctor
    ►0.48%   Parse  •  5 ms  •  1 call  •  Avalonia.Input.Cursor.Parse(String)
    ►0.45%   SplitView..cctor  •  5 ms  •  1 call  •  Avalonia.Controls.SplitView..cctor
    ►0.44%   Grid..cctor  •  5 ms  •  1 call  •  Avalonia.Controls.Grid..cctor
    ►0.42%   TabItem..cctor  •  4 ms  •  1 call  •  Avalonia.Controls.TabItem..cctor
    ►0.41%   set_Header  •  4 ms  •  3 calls  •  Avalonia.Controls.Primitives.HeaderedSelectingItemsControl.set_Header(Object)
    ►0.40%   TickBar..cctor  •  4 ms  •  1 call  •  Avalonia.Controls.TickBar..cctor
    ►0.39%   RotateTransform..ctor  •  4 ms  •  6 calls  •  Avalonia.Media.RotateTransform..ctor
    ►0.38%   set_IterationCount  •  4 ms  •  4 calls  •  Avalonia.Animation.Animation.set_IterationCount(IterationCount)
    ►0.36%   TopLevel..cctor  •  4 ms  •  1 call  •  Avalonia.Controls.TopLevel..cctor
    ►0.36%   Slider..cctor  •  4 ms  •  1 call  •  Avalonia.Controls.Slider..cctor
    ►0.35%   Add  •  4 ms  •  1,191 calls  •  Avalonia.Collections.AvaloniaDictionary`2.Add(TKey, TValue)
    ►0.35%   set_Duration  •  4 ms  •  7 calls  •  Avalonia.Animation.Animation.set_Duration(TimeSpan)
    ►0.34%   set_FillMode  •  3 ms  •  3 calls  •  Avalonia.Animation.Animation.set_FillMode(FillMode)
    ►0.33%   Parse  •  3 ms  •  12 calls  •  Avalonia.Animation.KeySpline.Parse(String, CultureInfo)
    ►0.32%   Parse  •  3 ms  •  4 calls  •  Avalonia.Animation.IterationCount.Parse(String)
    ►0.29%   Parse  •  3 ms  •  10 calls  •  Avalonia.Media.Geometry.Parse(String)
    ►0.27%   Track..cctor  •  3 ms  •  1 call  •  Avalonia.Controls.Primitives.Track..cctor
    ►0.24%   Expander..cctor  •  3 ms  •  1 call  •  Avalonia.Controls.Expander..cctor
    ►0.23%   OfType  •  2 ms  •  1,086 calls  •  Avalonia.Styling.Selectors.OfType(Selector, Type)
    ►0.22%   set_Angle  •  2 ms  •  6 calls  •  Avalonia.Media.RotateTransform.set_Angle(Double)
    ►0.19%   ConvertFrom  •  2 ms  •  14 calls  •  Avalonia.Media.BrushConverter.ConvertFrom(ITypeDescriptorContext, CultureInfo, Object)
    ►0.18%   Add  •  2 ms  •  545 calls  •  Avalonia.Styling.Styles.Add(IStyle)
    ►0.15%   get_Resources  •  2 ms  •  1,191 calls  •  Avalonia.Styling.Styles.get_Resources
    ►0.14%   Class  •  1 ms  •  572 calls  •  Avalonia.Styling.Selectors.Class(Selector, String)
    ►0.14%   FontFamily..ctor  •  1 ms  •  8 calls  •  Avalonia.Media.FontFamily..ctor(Uri, String)
    ►0.13%   DeferredTransformationFactoryV1  •  1 ms  •  85 calls  •  Avalonia.Markup.Xaml.XamlIl.Runtime.XamlIlRuntimeHelpers.DeferredTransformationFactoryV1(Func, IServiceProvider)
    ►0.11%   Complete  •  1 ms  •  1 call  •  Avalonia.Controls.NameScope.Complete
    ►0.11%   ProvideValue  •  1 ms  •  10 calls  •  Avalonia.Markup.Xaml.MarkupExtensions.RelativeSourceExtension.ProvideValue(IServiceProvider)
    ►0.10%   xaml..cctor  •  1 ms  •  1 call  •  CompiledAvaloniaXaml.!AvaloniaResources+NamespaceInfo:/FluentDark.xaml..cctor
    ►0.09%   ContentPresenter..cctor  •  1 ms  •  1 call  •  Avalonia.Controls.Presenters.ContentPresenter..cctor
    ►0.08%   CrossFade..ctor  •  1 ms  •  1 call  •  Avalonia.Animation.CrossFade..ctor
    ►0.07%   ConvertFrom  •  1 ms  •  5 calls  •  Avalonia.Media.TransformConverter.ConvertFrom(ITypeDescriptorContext, CultureInfo, Object)
    ►0.07%   Add  •  1 ms  •  62 calls  •  Avalonia.Collections.AvaloniaList`1.Add(T)
    ►0.07%   BeginInit  •  1 ms  •  4 calls  •  Avalonia.Controls.Primitives.SelectingItemsControl.BeginInit
    ►0.06%   Style..ctor  •  1 ms  •  545 calls  •  Avalonia.Styling.Style..ctor
    ►0.06%   set_Value  •  1 ms  •  1,143 calls  •  Avalonia.Styling.Setter.set_Value(Object)
    ►0.06%   Thickness..ctor  •  1 ms  •  160 calls  •  Avalonia.Thickness..ctor(Double, Double, Double, Double)
    ►0.05%   Animation..ctor  •  1 ms  •  7 calls  •  Avalonia.Animation.Animation..ctor
    ►0.05%   CornerRadius..ctor  •  1 ms  •  5 calls  •  Avalonia.CornerRadius..ctor(Double, Double, Double, Double)
    ►0.04%   PopParent  •  0 ms  •  1,390 calls  •  CompiledAvaloniaXaml.XamlIlContext+Context`1.PopParent
    ►0.04%   Name  •  0 ms  •  413 calls  •  Avalonia.Styling.Selectors.Name(Selector, String)
    ►0.04%   Template  •  0 ms  •  473 calls  •  Avalonia.Styling.Selectors.Template(Selector)
    ►0.04%   DataValidationErrors..cctor  •  0 ms  •  1 call  •  Avalonia.Controls.DataValidationErrors..cctor
    ►0.03%   Canvas..cctor  •  0 ms  •  1 call  •  Avalonia.Controls.Canvas..cctor
     0.03%   48 functions hidden  •  0 ms total  •  1238 calls total
    ►0.03%   DynamicResourceExtension..ctor  •  0 ms  •  657 calls  •  Avalonia.Markup.Xaml.MarkupExtensions.DynamicResourceExtension..ctor(Object)
    ►0.03%   DoubleTransition..ctor  •  0 ms  •  11 calls  •  Avalonia.Animation.DoubleTransition..ctor
    ►0.03%   set_KeySpline  •  0 ms  •  12 calls  •  Avalonia.Animation.KeyFrame.set_KeySpline(KeySpline)
    ►0.03%   set_InputGesture  •  0 ms  •  3 calls  •  Avalonia.Controls.MenuItem.set_InputGesture(KeyGesture)
    ►0.03%   LayoutTransformControl..cctor  •  0 ms  •  1 call  •  Avalonia.Controls.LayoutTransformControl..cctor
    ►0.02%   Or  •  0 ms  •  21 calls  •  Avalonia.Styling.Selectors.Or(IReadOnlyList)
    ►0.02%   NotificationCard..cctor  •  0 ms  •  1 call  •  Avalonia.Controls.Notifications.NotificationCard..cctor
    ►0.02%   Setter..ctor  •  0 ms  •  1,143 calls  •  Avalonia.Styling.Setter..ctor
    ►0.02%   PropertyEquals  •  0 ms  •  39 calls  •  Avalonia.Styling.Selectors.PropertyEquals(Selector, AvaloniaProperty, Object)
    ►0.02%   Parse  •  0 ms  •  2 calls  •  Avalonia.RelativePoint.Parse(String)
    ►0.02%   TransformOperationsTransition..ctor  •  0 ms  •  3 calls  •  Avalonia.Animation.TransformOperationsTransition..ctor
    ►0.02%   get_Setters  •  0 ms  •  1,115 calls  •  Avalonia.Styling.Style.get_Setters
    ►0.02%   set_KeyTime  •  0 ms  •  12 calls  •  Avalonia.Animation.KeyFrame.set_KeyTime(TimeSpan)
    ►0.02%   Resources!Property  •  0 ms  •  503 calls  •  CompiledAvaloniaXaml.XamlIlHelpers.Resources!Property
    ►0.02%   StaticResourceExtension..ctor  •  0 ms  •  118 calls  •  Avalonia.Markup.Xaml.MarkupExtensions.StaticResourceExtension..ctor(Object)
    ►0.02%   PushParent  •  0 ms  •  1,390 calls  •  CompiledAvaloniaXaml.XamlIlContext+Context`1.PushParent(Object)
    ►0.02%   XamlIlContext+Context`1..ctor  •  0 ms  •  1 call  •  CompiledAvaloniaXaml.XamlIlContext+Context`1..ctor(IServiceProvider, Object[], String)
    ►0.02%   TemplateBinding..ctor  •  0 ms  •  10 calls  •  Avalonia.Data.TemplateBinding..ctor(AvaloniaProperty)
    ►0.02%   Child  •  0 ms  •  19 calls  •  Avalonia.Styling.Selectors.Child(Selector)
    ►0.02%   Not  •  0 ms  •  7 calls  •  Avalonia.Styling.Selectors.Not(Selector, Selector)
    ►0.02%   Is  •  0 ms  •  6 calls  •  Avalonia.Styling.Selectors.Is(Selector, Type)
     0.02%   AddWithResize  •  0 ms  •  628 calls  •  System.Collections.Generic.List`1.AddWithResize(T)
    ►0.02%   Descendant  •  0 ms  •  22 calls  •  Avalonia.Styling.Selectors.Descendant(Selector)
    ►0.02%   Parse  •  0 ms  •  7 calls  •  Avalonia.Animation.Cue.Parse(String, CultureInfo)
    ►0.01%   ReflectionBindingExtension..ctor  •  0 ms  •  27 calls  •  Avalonia.Markup.Xaml.MarkupExtensions.ReflectionBindingExtension..ctor(String)
    ►0.01%   CornerRadiusTransition..ctor  •  0 ms  •  1 call  •  Avalonia.Animation.CornerRadiusTransition..ctor
     0.01%   set_Property  •  0 ms  •  1,143 calls  •  Avalonia.Styling.Setter.set_Property(AvaloniaProperty)
    ►0.01%   Register  •  0 ms  •  4 calls  •  Avalonia.Controls.NameScope.Register(String, Object)
    ►0.01%   FromUInt32  •  0 ms  •  264 calls  •  Avalonia.Media.Color.FromUInt32(UInt32)
    ►0.01%   StaticResourceExtension..ctor  •  0 ms  •  503 calls  •  Avalonia.Markup.Xaml.MarkupExtensions.StaticResourceExtension..ctor
    ►0.01%   set_Easing  •  0 ms  •  3 calls  •  Avalonia.Animation.Animation.set_Easing(Easing)
    ►0.01%   RelativeSourceExtension..ctor  •  0 ms  •  10 calls  •  Avalonia.Markup.Xaml.MarkupExtensions.RelativeSourceExtension..ctor(RelativeSourceMode)
    ►0.01%   GridLength..ctor  •  0 ms  •  5 calls  •  Avalonia.Controls.GridLength..ctor(Double, GridUnitType)
    ►0.01%   KeyFrame..ctor  •  0 ms  •  19 calls  •  Avalonia.Animation.KeyFrame..ctor
     0.01%   Add  •  0 ms  •  1,172 calls  •  System.Collections.Generic.List`1.Add(T)
    ►0.01%   FocusAdornerTemplate..ctor  •  0 ms  •  2 calls  •  Avalonia.Markup.Xaml.Templates.FocusAdornerTemplate..ctor
     0.01%   set_Selector  •  0 ms  •  545 calls  •  Avalonia.Styling.Style.set_Selector(Selector)
    ►0.01%   TransformConverter..ctor  •  0 ms  •  5 calls  •  Avalonia.Media.TransformConverter..ctor
    ►0.01%   Transitions..ctor  •  0 ms  •  15 calls  •  Avalonia.Animation.Transitions..ctor
    ►0.01%   WrapPanel..cctor  •  0 ms  •  1 call  •  Avalonia.Controls.WrapPanel..cctor
     0.01%   set_ResourceKey  •  0 ms  •  503 calls  •  Avalonia.Markup.Xaml.MarkupExtensions.StaticResourceExtension.set_ResourceKey(Object)
  ►0.00%   Styles..ctor  •  0 ms  •  1 call  •  Avalonia.Styling.Styles..ctor

Which again points to the culprit being JIT.

grokys avatar Mar 15 '21 13:03 grokys

Given that the slowdown seems to be due to just building the styles, it seems obvious that one solution might be to load less styles.

My next experiment adds a "filter" to the <StyleInclude>s in the fluent theme such that the source style XAML will only be loaded when a matching control is encountered. For example:

  <FluentStyleInclude Source="avares://Avalonia.Themes.Fluent/Controls/Carousel.xaml" Filter="{x:Type Carousel}"/>

The Filter property here is evaluated when running IStyle.TryAttach and if it doesn't match the control being styled then the style is not loaded. This is a bit of a hack at this point and means that resources need to be moved to FluentBaseDark/FluentBaseLight or they won't be found, and some controls are still glitchy but I wanted to see if it was likely to work before thinking about how to do this better, and...

Yep, it works. Startup time for a blank window went from 1s to 0.71s (yes my computer seems to have slowed down since I did the tests in the original post) which is a +25% speedup for a blank window.

Conclusion: we need to look into this more and see if we can generalize/unhackize this solution.

grokys avatar Mar 15 '21 16:03 grokys

will this approach be pursued further in the future?

im currently try to speed up the startup time of a simple app with just a TextBox, a Slider and a Button. On my iMac (i9, 32GB DDR4) it needs about 1400ms from App start to first RendererOnSceneInvalidated Event. On a Toradex IMX8M Mini (NXP® i.MX 8M Mini, 1GB LPDDR4) it almost took 4200ms.

StopWatch started in Programm.Main, step on App.cs:Initialize, step on App.cs.Initialize after AvaloniaXamlLoader.Load(this); and last step on first call of Renderer.SceneInvalidated.

Trace on IMX8M Mini: Total:24.2916 ms Step:26 ms: Application start ... Total:1150.0592 ms Step:1105 ms: App-Initialize Total:2490.6984 ms Step:1339 ms: App AvaloniaXamlLoader loaded Total:4189.2833 ms Step:1698 ms: First time rendered

DrWenz avatar Jul 01 '22 06:07 DrWenz

You need to use R2R. Startup time is increased mostly due to JIT.

kekekeks avatar Jul 01 '22 07:07 kekekeks

I already build with R2R:
dotnet publish -c Release -o publish -r linux-arm64 -p:PublishReadyToRun=true -p:PublishSingleFile=true -p:PublishTrimmed=true --self-contained true

DrWenz avatar Jul 01 '22 07:07 DrWenz

Yes, #8263 and #8422 should help with this when they're finished, though we don't have any numbers yet.

grokys avatar Jul 02 '22 00:07 grokys

4189.2833ms is still sounds very wrong. Not sure if it's platform specific, but on m1 and intel windows I never had so bad numbers even on slower laptop. With r2r setup.

maxkatz6 avatar Jul 02 '22 02:07 maxkatz6

On iMac or any other windows machine there are significant better startup times around 1 - 1.5 seconds. The 4.2 seconds measured on a IMX8M mini as we want to use Avalonia for embeeded solutions ❤️ once the app is started it runs very good on around 37fps 😄 the startup time is currently no reason we don't will got with avalonia, but is a bit of a downer in terms of UX for embeeded solutions

DrWenz avatar Jul 02 '22 13:07 DrWenz

@grokys Should this one be closed now?

robloo avatar Oct 15 '22 01:10 robloo