AvalonDock icon indicating copy to clipboard operation
AvalonDock copied to clipboard

XmlLayoutSerialer memory leak

Open pkindruk opened this issue 2 years ago • 5 comments

Pull request #308 introduced memory leak each time you serialize/deserialize layout. Affected versions: 4.60.1 and above. This behavior is persistent across multiple runtimes net48, netcoreapp3.1, net 5.0, net 6.0. Our projects has auto-save layout feature so memory grows pretty quickly.

Plain C# repro:

using System;
using System.IO;
using System.Xml.Serialization;

namespace Net5Sandbox
{
    public class XmlRoot
    {
        public int Id { get; set; }
    }

    class Program
    {
        static string Serialize(XmlRoot dataObj)
        {
            //var serializer = new XmlSerializer(typeof(XmlRoot)); //doesn't leak
            var serializer = XmlSerializer.FromTypes(new[] { typeof(XmlRoot) })[0]; //leaks
            using (var stream = new StringWriter())
            {
                serializer.Serialize(stream, dataObj);
                return stream.ToString();
            }
        }


        static void Main(string[] args)
        {
            const int repeatCount = 10000;

            var totalLength = 0L;
            var dataObj = new XmlRoot { Id = 1 };
            for (int i = 0; i < repeatCount; i++)
            {
                totalLength += Serialize(dataObj).Length;
            }

            Console.WriteLine($"Done total length = {totalLength}");

            Console.WriteLine(GC.GetTotalMemory(false));
            Console.WriteLine("Running GC");
            for (int i = 0; i < 10; i++)
            {
                GC.Collect(2, GCCollectionMode.Forced, true, true);
            }
            Console.WriteLine("Done");
            Console.WriteLine(GC.GetTotalMemory(false));
            Console.ReadKey();
        }
    }
}

Final console app memory consumption is around 500MiB depending on runtime. However GC.GetTotalMemory() shows that it is not a object allocation issue. Console output:

Done total length = 1670000
13077480
Running GC
Done
4912128

Debug output with XmlSerializer.FromTypes(Type[]) contains:

... ommited for clarity
'Net5Sandbox.exe' (CoreCLR: clrhost): Loaded 'Microsoft.GeneratedCode'. 
... (10000 times)
'Net5Sandbox.exe' (CoreCLR: clrhost): Loaded 'Microsoft.GeneratedCode'. 
... ommited for clarity

While debug output with XmlSerializer.ctor(Type) contains:

... ommited for clarity
'Net5Sandbox.exe' (CoreCLR: clrhost): Loaded 'Microsoft.GeneratedCode'.  // exactly once 
... ommited for clarity

Looks like XmlSerializer.FromTypes(Type[]) generates dynamic assembly or type with serialization logic every time we call it in runtime. Which stays there until app restart and causes memory leak.

I see a few solution to this issue:

  1. Get back to XmlSerializer.ctor(Type). However this will get us back to System.IO.FileNotFoundException which is still better than memory leak.
  2. Cache XmlSerializer.FromTypes(Type[]) result. However this requires changing serialization logic.
  3. Use Xml Serializer Generator. Never done that, especially with nuget package, might require manual .nuspec.

pkindruk avatar May 25 '22 13:05 pkindruk

@RadvileSaveraiteFemtika

pkindruk avatar May 25 '22 13:05 pkindruk

Well, we are no longer on that project with @RadvileSaveraiteFemtika. But @pkindruk tells true.
https://github.com/microsoft/referencesource/blob/master/System.Xml/System/Xml/Serialization/XmlSerializer.cs line 628 It is using some kind of reflection and it is slow. Even documentation tells results should be cached without explanation why... I guess now it is clear why... the first one works fast because results are internally cached by Microsoft.

@Dirkster99 he suggested three approaches. I could make changes for the second one (caching) or you can simply do the first one: revert the changes. Waiting for you answer.

audryste avatar Aug 09 '22 06:08 audryste

It is using some kind of reflection and it is slow. Even documentation tells results should be cached without explanation why... I guess now it is clear why...

I am finding that having a debugger attached while serializing layout causes it to take several seconds to complete. Is that related to the heavy reflection and leaking? I didn't notice the memory leak myself but it would be affecting me because I do auto layout saving very often. Maybe there is an earlier package version you could suggest without the performance issue?

romen-h avatar Sep 13 '22 18:09 romen-h

It is all written if first post @romen-h Affected versions: 4.60.1 and above. Use anything below < 4.60.1.

audryste avatar Sep 14 '22 07:09 audryste

@audryste If you could prepare a PR to implement the recommended implementation that comes without memory leaks, I'd be more than happy to apply it :-)

Dirkster99 avatar Sep 22 '22 18:09 Dirkster99