playwright-dotnet
[Bug]: Playwright running in .NET hosted services seems to leak memory
Version
1.45.0
Steps to reproduce
- Clone my repro repo: https://github.com/mu88/Repro_Playwright
- Run dotnet build
- Run pwsh Web\bin\Debug\net8.0\playwright.ps1 install
- Run the app (either via IDE or dotnet run)
- Check the app's log output whether a screenshot is created every 15 s (log message info: NewScreenshotCreator[0] Screenshot created)
- Attach a memory profiler to the app (e.g. dotMemory)
- Create a first memory dump
- Wait some time, e.g. 15 min
- Create another memory dump
Expected behavior
The app should not use more memory over time (both managed and unmanaged).
Actual behavior
The app uses more memory over time (both managed and unmanaged).
Additional context
I quickly analyzed one memory dump. Without further understanding of what's going on in Playwright internally, I don't want to speculate about the unmanaged memory. For the managed memory, however, I already discovered the following:
- Most of the memory is retained by StdIOTransport and some reflection code. StdIOTransport is instantiated in await Playwright.CreateAsync()
In the following screenshot, you see the increasing memory footprint over time:
In the next screenshot, you see the dominating types retaining the memory (last memory dump taken):
And last but not least, the following screenshot shows several issues, e.g. duplicate strings, sparse arrays, and leaking event handlers:
Environment
- Operating System: Windows 10, Windows 11, WSL2, Linux (Raspberry Pi)
- CPU: arm64, amd64
- Browser: All
- .NET Version (TFM): [net8.0]
Thank you for your bug report. When running it via VS on Windows, it gave me these results:
The increased heap size seems to be caused by the cache that System.Text.Json keeps internally. I noted two things earlier. First, try manually installing a recent version of System.Text.Json, since it might include some caching fixes they have been working on. Second, they internally have a time-based cache, which might keep things alive for a few seconds. So GC.Collect() might help.
Also when doing the following, it seems to not leak for me:
using Microsoft.Playwright;

Console.WriteLine($"Started process under PID {System.Diagnostics.Process.GetCurrentProcess().Id}");
while (true) {
    await CreateScreenshotAsync(1920, 1080);
    GC.Collect();
    Console.WriteLine($"TotalMemory: {GC.GetTotalMemory(false)}");
}

async Task CreateScreenshotAsync(uint width, uint height)
{
    using var playwright = await Playwright.CreateAsync();
    await using var browser = await playwright.Chromium.LaunchAsync();
    var page = await browser.NewPageAsync();
    await page.SetViewportSizeAsync((int)width, (int)height);
    await page.GotoAsync("https://playwright.dev/dotnet/");
    await page.ScreenshotAsync(new PageScreenshotOptions { Path = "Screenshot.png", Type = ScreenshotType.Png });
    Console.WriteLine("Screenshot created");
}
Ideally we are able to create a repro without AspNetCore out of it. Do you observe the same without AspNetCore?
Thank you for getting back to me.
Even when adding GC.Collect() to my ASP.NET Core sample code, the memory slowly increases over time:
However, when running a console app with the following code, I don't see this behavior:
using Microsoft.Playwright;

Console.WriteLine($"Started process under PID {System.Diagnostics.Process.GetCurrentProcess().Id}");
PeriodicTimer timer = new(TimeSpan.FromSeconds(15));
while (await timer.WaitForNextTickAsync(CancellationToken.None))
{
    await CreateScreenshotAsync(1920, 1080);
    GC.Collect();
    Console.WriteLine($"TotalMemory: {GC.GetTotalMemory(false)}");
}

async Task CreateScreenshotAsync(uint width, uint height)
{
    using var playwright = await Playwright.CreateAsync();
    await using var browser = await playwright.Chromium.LaunchAsync();
    var page = await browser.NewPageAsync();
    await page.SetViewportSizeAsync((int)width, (int)height);
    await page.GotoAsync("https://playwright.dev/dotnet/");
    await page.ScreenshotAsync(new PageScreenshotOptions { Path = "Screenshot.png", Type = ScreenshotType.Png });
    Console.WriteLine("Screenshot created");
}
In case you're asking why I care: due to the problematic behavior in my ASP.NET Core app (see here), which runs on my Raspberry Pi 4 in a docker compose stack with a memory resource limit of 1 GB, I see OOM exceptions after some time due to the continuously increasing memory ☹️ I can also configure a memory limit of 0.5 or 2 GB, it doesn't matter: after some time, all the memory is used.
So far, I can only mitigate this by configuring a restart policy for the Docker container, i.e. the process is more or less like this:
- Create a container and create screenshots for a while.
- Work, work, work... (while using ever more memory)
- The container crashes due to insufficient memory
- Start a new container and return to step 1
So I see the following follow-up questions:
- Why does Playwright behave differently in an ASP.NET Core hosted service?
- What is the source of the growing unmanaged memory?
- How can the GC.Collect call be avoided (even though it only seems to slow the memory increase)? Since the .NET garbage collector is triggered automatically as soon as memory gets scarce, it runs a full GC anyway, but due to the ever-growing unmanaged memory, it cannot reclaim enough memory and the container dies.
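To tell managed growth from unmanaged growth without a profiler, one option is to log the managed heap size next to the process-wide private bytes after each screenshot. The following is only a minimal sketch: MemoryProbe is a hypothetical helper, not part of Playwright, and it measures only the .NET process itself, so memory held by the separate Node.js driver or browser processes would not show up in these numbers.

```csharp
using System;
using System.Diagnostics;

// Take one sample and print it; in the repro this could run after each screenshot.
var (managed, privateBytes) = MemoryProbe.Sample();
Console.WriteLine($"Managed heap: {managed / 1024} KiB, private bytes: {privateBytes / 1024} KiB");

// Hypothetical helper for illustration; the name is an assumption, not an existing API.
static class MemoryProbe
{
    public static (long ManagedBytes, long PrivateBytes) Sample()
    {
        using var process = Process.GetCurrentProcess();

        // Bytes currently thought to be allocated on the managed heap (no forced collection).
        long managedBytes = GC.GetTotalMemory(forceFullCollection: false);

        // Private memory of the whole process, which includes unmanaged allocations.
        long privateMemory = process.PrivateMemorySize64;

        return (managedBytes, privateMemory);
    }
}
```

If the private bytes keep climbing while the managed heap stays flat, the growth is likely outside the reach of the garbage collector.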
We are facing a similar problem: we are using ASP.NET Core and continuously taking screenshots. The memory growth is pretty slow, but our service keeps leaking and eats up all 8 GB of memory within 2 days.
@shuowpro: which version of Playwright are you using? I'm now on 1.45.1 and it looks better over the last two weeks:
Before, it was constantly crashing after some days (each color represents a new container):
I'll close it for now, since this issue is unfortunately not actionable for us. It looks like some bug in ASP.NET, where I recommend filing against them. Thanks for your understanding and happy that it seems resolved!
@mxschmitt ...and the ASP.NET Core guys will argue the same: it looks like some bug in Playwright 😥
@mu88 do you have a reference to their response? I hope to dedicate some time to it later this week or reach out to some more experienced .NET experts in that area.
So happy to find this issue... and good sleuthing @mu88!
We have a container that grows to many gigabytes in size (over a few days), OOMs, is killed, is restarted, ... etc., etc., etc. We failed to find the cause and were completely stumped. And GC.Collect() doesn't work for us either. Not for a moment did it occur to us that it could be Playwright, we assumed it was our code.
Months later, I found this issue, and it suddenly makes sense!
(Environment: playwright invoked periodically in hosted service, .net7 linux container)
@mu88 @lonix1 Did you get the chance to check for any improvements on this general issue on the more recent versions? Thanks :)
@omni-htg: no news from my side since this update
@omni-htg unfortunately we are stuck on that version for now, so I can't comment on recent versions.
How about you... what behaviour are you seeing?
@lonix1 Thanks for reaching out! Currently we are only using it for short-lived integration testing on .NET 8, so it's not the kind of scenario that would match your use case, nor are we seeing egregious consumption -- but I still wanted to consider the heads-up if it was still happening to you both on the more recent versions.
The memory leak occurs because Playwright subscribes to events and never unsubscribes: https://github.com/microsoft/playwright-dotnet/blob/64d7b6553527758869c5ee3386c479a14354ff7e/src/Playwright/Playwright.cs#L44
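As a generic illustration of that kind of leak (not Playwright's actual code), the sketch below uses two hypothetical types, Publisher and Client. The static event is an assumption that stands in for any long-lived publisher; I have not checked which event the linked line subscribes to. Every handler that is never removed stays in the publisher's invocation list and keeps its Client instance reachable, even after Dispose.

```csharp
using System;

// 300 short-lived clients that never unsubscribe: all 300 handlers stay registered.
for (int i = 0; i < 300; i++)
{
    using var leaky = new Client(unsubscribeOnDispose: false);
}
Console.WriteLine(Publisher.SubscriberCount); // prints 300

// 300 more clients that unsubscribe in Dispose: the count does not grow any further.
for (int i = 0; i < 300; i++)
{
    using var clean = new Client(unsubscribeOnDispose: true);
}
Console.WriteLine(Publisher.SubscriberCount); // prints 300

// Hypothetical long-lived publisher (assumption: a static event).
static class Publisher
{
    public static event EventHandler? Tick;

    public static int SubscriberCount => Tick?.GetInvocationList().Length ?? 0;
}

// Hypothetical short-lived subscriber holding some memory.
sealed class Client : IDisposable
{
    private readonly bool _unsubscribeOnDispose;
    private readonly byte[] _buffer = new byte[1024]; // retained as long as the handler is registered

    public Client(bool unsubscribeOnDispose)
    {
        _unsubscribeOnDispose = unsubscribeOnDispose;
        Publisher.Tick += OnTick;
    }

    private void OnTick(object? sender, EventArgs e) { }

    public void Dispose()
    {
        // Without this removal the publisher keeps a reference to this instance forever.
        if (_unsubscribeOnDispose)
        {
            Publisher.Tick -= OnTick;
        }
    }
}
```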
The issue is not in the ScreenshotAsync() command; it is in the Playwright class, when it is started and closed many times. I guess users are usually expected to reuse it, so just keep one single instance within your background service.
I applied a dirty fix locally, and memory usage improved. In the following screenshot you can see before/after (this experiment ran for 2 minutes and just executed using var p = await Playwright.CreateAsync() in a loop). In both cases Node.js was run ~300 times.
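A minimal sketch of that suggestion, adapted from the console repro earlier in this thread: Playwright.CreateAsync() is called once and the resulting instance is passed into every screenshot call. Passing IPlaywright as a parameter is my own arrangement for illustration, and I have not measured whether this alone removes the growth in an ASP.NET Core hosted service.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Playwright;

// Create the Playwright instance once for the lifetime of the process.
using var playwright = await Playwright.CreateAsync();

using PeriodicTimer timer = new(TimeSpan.FromSeconds(15));
while (await timer.WaitForNextTickAsync(CancellationToken.None))
{
    await CreateScreenshotAsync(playwright, 1920, 1080);
}

// Reuses the shared Playwright instance; only the browser is started and stopped per call.
static async Task CreateScreenshotAsync(IPlaywright playwright, int width, int height)
{
    await using var browser = await playwright.Chromium.LaunchAsync();
    var page = await browser.NewPageAsync();
    await page.SetViewportSizeAsync(width, height);
    await page.GotoAsync("https://playwright.dev/dotnet/");
    await page.ScreenshotAsync(new PageScreenshotOptions { Path = "Screenshot.png", Type = ScreenshotType.Png });
    Console.WriteLine("Screenshot created");
}
```

In a hosted service, the same idea would mean creating the instance when the service starts and disposing it when the service stops, rather than inside each timer tick.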
Now I kept the experiment running for a while, and memory usage stabilized at some point (playwright was invoked 1200 times):
Before the dirty fix, Playwright consumed 1 GB+ and kept growing.
Sharing my dirty fix: https://github.com/nvborisenko/playwright-dotnet/commit/119724fb3c473cb5419ccae10efe9334bc04be86