# Reduce Allocations in Text Rendering

## What does the pull request do?
This PR addresses some performance regressions and reduces overall allocations when rendering a lot of (complex) text runs.
## What is the current behavior?

When I upgraded AvaloniaHex to the latest version of Avalonia and .NET 10, I noticed a pretty significant bump in memory allocations and a performance degradation when rendering many (complex) text runs. Upon profiling, I found a couple of related hotspots:
- In `Avalonia.Media.TextFormatting` there are various precompiled tries stored as raw data. This data is recreated on every access of the trie. E.g., here is `UnicodeData.trie`: https://github.com/AvaloniaUI/Avalonia/blob/0b5a82884e2cfa24499f5f05fc84e0dd52c734dd/src/Avalonia.Base/Media/TextFormatting/Unicode/UnicodeData.trie.cs#L15-L22

  These computed tries are used quite extensively throughout the library. This results in a significant number of unnecessary allocations of very large data blobs (literally millions of instances), significantly slowing down controls that do a lot of (complex) text rendering. Below is an example of how AvaloniaHex is affected when scrolling down and up once in the example project:

- `FontFamily.Parse` eventually calls `FontFamily.GetFontSourceIdentifier`, which always allocates extra `string` and `string[]` instances when parsing a font by name, even if the font name is a simple font without any fallback fonts:

- `FontShaperImpl.ShapeText` in `Avalonia.Skia` uses a language cache with a `GetOrAdd` construction that uses a non-static / closure-capturing lambda for its factory: https://github.com/AvaloniaUI/Avalonia/blob/0b5a82884e2cfa24499f5f05fc84e0dd52c734dd/src/Skia/Avalonia.Skia/TextShaperImpl.cs#L45-L47

  This results in a closure being created for every text run, which goes unused most of the time because the culture rarely changes:

- `LineBreakEnumerator` always creates a `LineBreakState` heap allocation (even if a text run does not contain line breaks). This is wasteful, especially considering `LineBreakEnumerator` is already a `ref struct` and will never appear on the heap:
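To make the closure issue concrete, here is a minimal sketch (with illustrative names, not the actual Avalonia types) of why a capturing lambda allocates on every call, and the allocation-free alternative using the `GetOrAdd` overload that threads the state through as a factory argument:

```csharp
using System.Collections.Concurrent;

class LanguageCache
{
    private readonly ConcurrentDictionary<string, object> _cache = new();

    // Capturing lambda: 'culture' is closed over, so a closure object and a
    // delegate are allocated on every call, even when the key already exists.
    public object GetCapturing(string key, object culture)
        => _cache.GetOrAdd(key, k => CreateLanguage(k, culture));

    // Static lambda + factory argument: no closure; the state travels through
    // the TArg parameter of the GetOrAdd<TArg> overload instead.
    public object GetStatic(string key, object culture)
        => _cache.GetOrAdd(key, static (k, c) => CreateLanguage(k, c), culture);

    private static object CreateLanguage(string key, object culture) => new();
}
```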
## What is the updated/expected behavior with this PR?
This PR removes the vast majority of these allocations. Running the example project of AvaloniaHex with these changes applied to Avalonia makes scrolling much smoother.
## How was the solution implemented (if it's not obvious)?

In the same order as the issues described above:
- `UnicodeDataGenerator` was updated to generate code with a readonly `Data` field as opposed to a computed property. This removes all `RuntimeFieldInfoStub` allocations.
- `FontFamily.GetFontSourceIdentifier` was rewritten to avoid any array allocations and most string reallocations using `ReadOnlySpan<char>`s and slicing.
- `FontShaperImpl.ShapeText` now uses the overload of `GetOrAdd` that passes an argument to the factory lambda.
- `LineBreakEnumerator` + `LineBreakState` was turned into a `ref struct` and is now passed along as a `ref` parameter to all Unicode rule methods.
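The last change can be sketched roughly like this (member names are illustrative, not the actual Avalonia internals): since both types are `ref struct`s, the state lives inline on the stack and is mutated through a `ref` parameter instead of being heap-allocated:

```csharp
// Stack-only state: as a ref struct it can never be boxed or heap-allocated.
ref struct LineBreakState
{
    public int Position;
}

ref struct LineBreakEnumerator
{
    private LineBreakState _state; // lives inline in the enumerator, on the stack

    public bool MoveNext(ReadOnlySpan<char> text)
        => ApplyRules(text, ref _state); // rule methods mutate the state via ref

    private static bool ApplyRules(ReadOnlySpan<char> text, ref LineBreakState state)
    {
        // Real Unicode line-breaking rules would update 'state' here;
        // this placeholder just advances through the text.
        if (state.Position >= text.Length)
            return false;
        state.Position++;
        return true;
    }
}
```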
## Checklist
- [ ] Added unit tests (if possible)?
- [ ] Added XML documentation to any related classes?
- [ ] Consider submitting a PR to https://github.com/AvaloniaUI/avalonia-docs with user documentation
## Breaking changes
None. All changes are made on internal or private APIs.
## Obsoletions / Deprecations

None.
## Fixed issues
Related to #16390
## Additional Questions
Maybe out of scope for this PR, but I am still seeing a lot of instances of `SKFont` (scaling linearly with the number of text runs I create) in AvaloniaHex, even though I reuse the same `Typeface` instances as much as possible. Are there any possibilities/plans for caching `SKFont` instances?
You can test this PR using the following package version: `12.0.999-cibuild0060453-alpha` (feed url: https://nuget-feed-all.avaloniaui.net/v3/index.json). [PRBUILDID]
We can try to reuse one SKFont instance and change its properties before we call some API that needs it. I'm not sure how costly mutating it is. If that doesn't improve anything, we can cache SKFont instances per font size.
Thank you for your contribution
> We can try to reuse one SKFont instance and change its properties before we call some API that needs it.

I am not sure this would help much, because every `GlyphRunImpl` at the moment creates a new `SKFont`. So even if we share typefaces or only slightly change properties of some public Avalonia text/font-related types, it would still be recreating an `SKFont`.

> we can cache SKFont instances per font size.

This is what I was thinking as well, though we would also need to cache it by `edging` type. Most of the instances seem to come from this method:

https://github.com/AvaloniaUI/Avalonia/blob/0b5a82884e2cfa24499f5f05fc84e0dd52c734dd/src/Skia/Avalonia.Skia/GlyphRunImpl.cs#L135-L144

I am not entirely sure what the best approach would be; do you think we should have `GlyphTypefaceImpl` cache them?
Thank you for your contribution!
I'll review and test in depth when I have a bit more time, but the first point seems very strange to me. Which OS, architecture and exact runtime are you using?
For a few versions of the C# compiler now, `ReadOnlySpan<T>`s of primitive types with constant values have actually been embedded directly inside the assembly. You get a simple pointer to the static data at runtime, without having to allocate heap memory at all. Said another way, `new[]` doesn't allocate in this case (nowadays, collection expressions should be used to make that more obvious). #15074 implemented this.

A quick check with https://godbolt.org/z/KP1xnE5Yj shows that this hasn't changed at all and still works as expected in .NET 10. I'll look at Avalonia in detail as soon as I can :)
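To make the pattern concrete, here is a minimal sketch (illustrative names) of the two spellings being discussed; in a Release build the compiler emits the constant data into the assembly's data section and the getter returns a span over it, with no array allocation:

```csharp
public static class UnicodeTables
{
    // In Release builds, the C# compiler recognizes this pattern and returns
    // a span over static (RVA) data in the assembly; no uint[] is allocated.
    public static ReadOnlySpan<uint> Data => new uint[] { 1, 2, 3, 4 };

    // Equivalent modern spelling using a collection expression.
    public static ReadOnlySpan<uint> Data2 => [1, 2, 3, 4];
}
```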
> I am not entirely sure what the best approach would be, do you think we should have `GlyphTypefaceImpl` cache them?
Yes, `GlyphTypefaceImpl` would cache them.
> I'll review and test in depth when I have a bit more time, but the first point seems very strange to me. Which OS, architecture and exact runtime are you using?

Yes, I also found it quite strange, and something that probably would've been caught by you guys already.
- Arch: x64
- OS: NixOS 25.11/unstable (running in an FHS devshell with x11 libraries in PATH), Kernel 6.12.58
- WM: Hyprland 0.52.1
- Editor: JetBrains Rider 2025.3.
- dotnet: 10.0.100 (but I also have other versions installed)
After posting the PR I double-checked my build configs, and it seems I ran my tests under the DEBUG config (Sharplab seems to confirm this too). My bad, I should've run all tests in RELEASE mode. Nonetheless, this change may still be worth it for speeding up debug builds :), and the other issues are still present even in release mode.
> We can try to reuse one SKFont instance and change its properties before we call some API that needs it.

Please avoid native objects with mutable state in `IGlyphRun` and friends. Those can be used from multiple threads, and it's really easy to introduce hard-to-track native memory corruption.
> Please, avoid native objects with mutable state in IGlyphRun and friends. Those can be used from multiple threads and it's really easy to introduce hard to track native memory corruption.

So are you suggesting we can't cache the `SKFont` in the `GlyphTypefaceImpl` that already holds an `SKTypeface`?
`IGlyphRunImpl` is immutable.
> My bad, I should've run all tests in RELEASE mode. Nonetheless, this change may still be worth it for speeding up debug builds :), also the other issues are still present even in release mode.

Even without any performance boost, it's worth doing this solely to prevent other developers from seeing the same alarming number of allocations that you did and wasting their time trying to investigate the source.

Can we not get the best of both worlds like this?

```csharp
private static ReadOnlySpan<uint> Data { get; } = new uint[] ...
```

I assume that this would still trigger the compiler optimisation, and it definitely avoids those million+ array allocations at runtime.
> Can we not get the best of both worlds like this?
>
> `private static ReadOnlySpan<uint> Data { get; } = new uint[] ...`

Sadly, this is not possible, because fields (and by extension, property backing fields) cannot be of type `ReadOnlySpan<T>` unless they are instance fields of a `ref struct`. This is why I changed it to a `uint[]`. Happy to hear other options that could get rid of the single array allocation, though.
We need to keep `ReadOnlySpan<uint> Data => new uint[] ...` to get the optimization. Since a getter-only expression-bodied property (`=>`) doesn't have any state (that could hypothetically be mutated with reflection), it's easier for the compiler to assume optimizations. But I don't know if .NET 10 is better at optimizing `{ get; }`.
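To illustrate the distinction (illustrative names; the commented-out line shows why the auto-property variant doesn't work):

```csharp
public static class SpanProperties
{
    // OK: expression-bodied getter with no backing state; in Release builds
    // the compiler lowers it to a span over static RVA data in the assembly.
    public static ReadOnlySpan<uint> Works => new uint[] { 1, 2, 3 };

    // Does not compile: a { get; } auto-property needs a backing field, and
    // ReadOnlySpan<uint> fields are only legal as instance fields of a ref struct.
    // public static ReadOnlySpan<uint> Broken { get; } = new uint[] { 1, 2, 3 };
}
```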
Note regarding the `ReadOnlySpan<T>`: even in debug mode, I don't see those allocations at all (I was very surprised at the original claim since I remember running dotMemory on debug builds several times). Tried on Windows x64 and macOS ARM64 with the latest `master` branch. This is a JIT intrinsic, so I'm not sure why that happens on your machine.
> Note regarding the `ReadOnlySpan<T>`: even in debug mode, I don't see those allocations at all (I was very surprised at the original claim since I remember running dotMemory on debug builds several times). Tried on Windows x64 and macOS ARM64 with the latest `master` branch. This is a JIT intrinsic so I'm not sure why that happens on your machine.
I have done some additional validation on DEBUG builds, this time on a fresh Ubuntu 25.04 VM as well as a Windows 10 x64 VM, both using a fresh .NET 10 installation (obtained through the preview PPA on Ubuntu and winget on Windows). I am still seeing the same allocations of `System.RuntimeFieldInfoStub` across all systems when using computed `ReadOnlySpan` properties.
How to reproduce:

1. Compile this code for .NET 10:

   **Program.cs**

   ```csharp
   using System;
   using System.Threading;
   using System.Runtime.CompilerServices;

   internal class Program
   {
       public static void Main(string[] args)
       {
           Thread.Sleep(5000); // Added to give some time to enable full memory allocation tracking

           var random = new Random();
           for (int i = 0; i < 1000000; i++)
           {
               DoSomething(Foo[random.Next(Foo.Length)]);
           }

           Console.WriteLine("Done");
           Thread.Sleep(10000); // Added to give some time to create a snapshot
       }

       [MethodImpl(MethodImplOptions.NoInlining)]
       private static void DoSomething(uint u)
       {
       }

       public static ReadOnlySpan<uint> Foo => new uint[] { 1, 2, 3, 4 };
       public static ReadOnlySpan<uint> Bar => [1, 2, 3, 4];
   }
   ```

   **Program.csproj**

   ```xml
   <Project Sdk="Microsoft.NET.Sdk">
     <PropertyGroup>
       <OutputType>Exe</OutputType>
       <TargetFramework>net10.0</TargetFramework>
       <ImplicitUsings>enable</ImplicitUsings>
       <Nullable>enable</Nullable>
     </PropertyGroup>
   </Project>
   ```

2. Start the dotMemory CLI tool:

   ```shell
   $ dotMemory start path/to/Program
   Z:\> dotMemory.exe start path\to\Program.exe
   ```

3. Enable full memory tracking during the first `Thread.Sleep` call:

   ```
   ##dotMemory["collect-allocations-on", {pid: xxxx}]
   ```

4. Create a snapshot during the second `Thread.Sleep` call:

   ```
   ##dotMemory["get-snapshot", {pid: xxxx}]
   ```

5. Open the snapshot in dotMemory. Observe that the allocations are dominated by `System.RuntimeFieldInfoStub`:

   ```
   Allocated type : System.RuntimeFieldInfoStub
   Objects        : n/a
   Bytes          : 144000000

   Allocated by
   100% FromPtr • 137.33 MB / 137.33 MB • System.RuntimeFieldInfoStub.FromPtr(IntPtr)
     100% get_Foo • 137.33 MB / - • global::Program.get_Foo()
       100% Main • 137.33 MB / - • global::Program.Main(String[])
         ► 100% [AllThreadsRoot] • 137.33 MB / - • [AllThreadsRoot]
   ```
I am happy to revert the changes on `ReadOnlySpan<T>` for the tries in this PR, but this seems to be a reliably reproducible hotspot. Arguably, this may not necessarily be related to Avalonia, and we may want to move this specific issue to the dotnet/runtime repo to see what they have to say about it. Let me know what you think :).
EDIT: Dumping the generated x64 code using the `DOTNET_JitDisasm` environment variable also confirms that the `get_Foo` property does a whole lot more in DEBUG builds than simply returning a static handle to the RVA data.
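For reference, one way to dump that disassembly (assuming the repro project above; exact output varies by runtime version and configuration):

```shell
# DOTNET_JitDisasm filters JIT disassembly output by method name.
# Run the repro in Debug mode and inspect what get_Foo compiles to.
DOTNET_JitDisasm="get_Foo" dotnet run -c Debug
```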
Yes you're right, not sure how I missed that last time, or if I wasn't looking at the right thing, sorry about that.
While I still think that keeping things as they are for simplicity is fine (the runtime has tons of `ReadOnlySpan<...> => []` usages) and that profiling should only be done in release mode, let's make a change to keep memory allocations low in debug.
Let's generate something like this instead:
```csharp
#if DEBUG
public static ReadOnlySpan<uint> Bar => s_bar;
private static uint[] s_bar =
#else
public static ReadOnlySpan<uint> Bar =>
#endif
    [1, 2, 3, 4];
```
You can test this PR using the following package version: `12.0.999-cibuild0060733-alpha` (feed url: https://nuget-feed-all.avaloniaui.net/v3/index.json). [PRBUILDID]