Memory leak in master branch (automation related?)
Describe the bug The latest master branch (tested 11.0.4 and 0.10.999-cibuild0020870-beta https://github.com/AvaloniaUI/Avalonia/commit/cea6bc27a06f25b4f2d3a73d3411e904b1767e74) has a memory leak, probably related to Automation code.
Note: this is a regression from the current stable 0.10.x version (which doesn't have automation related modules).
To Reproduce The repro is very simple - just add a button, press it and the button will not be freed anymore.
- Clone the repo: https://github.com/BAndysc/AvaloniaMenuLeak
- Checkout branch
leak2 - Run the app
- The top text shows managed used memory, in the beginning it is less than 3 MBs
- Press the button Add control
- This will spawn a control, that has a reference to 1GB byte array, so obviously the ram usage will rise up to 1 GB
- Press Remove Control
- GC will free the control and datacontext almost instantly, and it will go down to 3 MBs, so far so good.
- Press Add control again
- This time click the button 'Click me'
- Press Remove control
- Notice this time the memory usage doesn't drop. The control is not freed!
Screenshots
dotMemory screenshot:

Desktop:
- OS: Mac
- Version: 0.10.999-cibuild0020870-beta (it works fine on 0.10.14)
Possibly related to #7751
Pretty weird, I used dotMemory as well to fix previous memory leak, and I used latest master (well, couple of days old now). And there was only one memory leak with old sample code.
@grokys I will take a look tomorrow, but if you know something about automation what might be a reason here, it can help.
@robloo while I am on it, I can fix that one too. Thanks for reminding.
Pretty weird, I used dotMemory as well to fix previous memory leak, and I used latest master (well, couple of days old now). And there was only one memory leak with old sample code.
if you mean my sample code, then the previous one (menu related) didn't have this button, I've discovered it accidentally, while trying to reproduce another leak
@maxkatz6 If you want to close #7751 great! It's otherwise still on my list to complete before 11.0 but it's one of the lowest priority.
Yeah, reproduced using Accessibility Insights to trigger automation.
Got three weak references that live after button was detached:

I think the reason for this memory leak is that the method responsible for removing one, two or three weak references is simply not called: I'm talking about Avalonia.Win32.Automation.Automation.AutomationNode.Release method:
I was told by @maxkatz6 that @grokys may know where this method can/should be called. Pinged the second one, hopefully he will reply and help with fix.
Hey @flexxxxer I just talked to @grokys and asked him if he has any idea. Unfortunately, this seems to be hard to debug as e.g. "dotmemory can't really help with COM leaks". So it may take quite a while for us to find the root cause and track this down. Thank you for your hands off on this :-)
Thanks @timunie . To address the usage of Release - I believe this should be called by the win32 automation APIs but I may be mistaken, need to find a little time to actually look into it (I may have this completely wrong rn, can't really remember offhand how it all works).
I have idea how to fix that (where call Release), will try implement it in few days
This problem still happens with latest master (11.2.999-cibuild0047389-alpha), but only on macOS
I am trying to understand the code, it would be great if anybody could go through this and help me understand it.
Native AvnAccessibilityElement holds a pointer to IAvnAutomationPeer.
https://github.com/AvaloniaUI/Avalonia/blob/d5ca06ff2432dcc4671a4ed795a5ce6a5cdadddc/native/Avalonia.Native/src/OSX/automation.mm#L56-L60
It is saved into a field here:
https://github.com/AvaloniaUI/Avalonia/blob/d5ca06ff2432dcc4671a4ed795a5ce6a5cdadddc/native/Avalonia.Native/src/OSX/automation.mm#L93-L100
And initWithPeer is called from acquire:
https://github.com/AvaloniaUI/Avalonia/blob/d5ca06ff2432dcc4671a4ed795a5ce6a5cdadddc/native/Avalonia.Native/src/OSX/automation.mm#L63-L91
acquire is called from few places. I.e. from accessibilityHitTest, afaik called by the OS when the user press a pointer
https://github.com/AvaloniaUI/Avalonia/blob/d5ca06ff2432dcc4671a4ed795a5ce6a5cdadddc/native/Avalonia.Native/src/OSX/automation.mm#L476-L482
hit comes from the RootProvider_GetPeerFromPoint call, which goes to c# world via microcom generated code (code reformated for visibility, i.e. removed try{}catch{}):
static void* RootProvider_GetPeerFromPoint(void* @this, AvnPoint point)
{
IAvnAutomationPeer __target = (IAvnAutomationPeer)global::MicroCom.Runtime.MicroComRuntime.GetObjectFromCcw(new IntPtr(@this));
var __result = __target.RootProvider_GetPeerFromPoint(point);
return global::MicroCom.Runtime.MicroComRuntime.GetNativePointer(__result, true);
}
Lastly, let's see MicroComRuntime.GetNativePointer (also removed parts that are not important)
public static void* GetNativePointer<T>(T obj, bool owned = false) where T : IUnknown
{
if (obj is IMicroComShadowContainer container)
{
container.Shadow ??= new MicroComShadow(container);
void* ptr = null;
var res = container.Shadow.GetOrCreateNativePointer(typeof(T), &ptr);
if (owned)
container.Shadow.AddRef((Ccw*)ptr);
return ptr;
}
throw new ArgumentException("Unable to get a native pointer for " + obj);
}
So it creates MicroComShadow if needed, then increment a reference counter (AddRef). Wait a minute. So if this method increases the reference counter, somewhere it should be decremented. But where? I don't see in the code when those objects can be deallocated. MicroComShadow is the object that keeps the reference to automation classes, which eventually holds references to view classes. Please correct me if am wrong.
https://github.com/AvaloniaUI/Avalonia/blob/master/src/Avalonia.Native/AvnAutomationPeer.cs is the managed part. So if that is cleaned up it should also clean up the native part.
@Gillibald
As far as I understand, the problem is that the Native wrapper - MicroComShadow keeps a strong reference to AvnAutomationPeer, while MicroComShadow itself if rooted, thus never collected, neither AvnAutomationPeer nor Control nor DataContext in result, leading to massive leaks.
https://github.com/kekekeks/MicroCom/blob/2f698e682efba38feeaf4e84ca9a39052fe6cab3/src/MicroCom.Runtime/MicroComShadow.cs#L15-L20
internal IMicroComShadowContainer Target { get; }
internal MicroComShadow(IMicroComShadowContainer target)
{
Target = target;
Target.Shadow = this;
}
I am not exactly why MicroComShadow gets rooted (it makes sense since it has to be accessible from the unmanaged code, but I am not sure when it exactly it happens).
I always was suspicious of these IAvnString usages: https://github.com/AvaloniaUI/Avalonia/blob/master/src/Avalonia.Native/AvnAutomationPeer.cs#L32-L40
You can clearly see that AvnString is created on C# side, but never disposed.
For file picker code I had a hack like this, ensuring that all of these strings will be disposed: https://github.com/AvaloniaUI/Avalonia/blob/27e40b820e667c4ab1c7edacf4d3c65dfe3c7f4a/src/Avalonia.Native/SystemDialogs.cs#L131-L148 Which are all disposed together after file picked: https://github.com/AvaloniaUI/Avalonia/blob/27e40b820e667c4ab1c7edacf4d3c65dfe3c7f4a/src/Avalonia.Native/SystemDialogs.cs#L56
As well as on ObjC side all strings (ideally) should be released as soon as possible. For example, right when they are converted to NSStrings: https://github.com/AvaloniaUI/Avalonia/blob/27e40b820e667c4ab1c7edacf4d3c65dfe3c7f4a/native/Avalonia.Native/src/OSX/SystemDialogs.mm#L418
But yes, whole AutomationPeer leaking is a bigger object/problem than some strings.
This issue is also on windows. We are using this with UIAutomation and we have that here. There is any news about it ?