LaunchAsync throws WebSocket error on AWS Lambda

Open Coolgatty opened this issue 1 year ago • 17 comments

I'm in the process of updating to .NET 8. After #2784 was resolved, I deployed to the Lambda dotnet8 runtime, but LaunchAsync now fails with the following errors.

In order of appearance:

2024-09-27T23:06:45.861Z	e0bbaeba-e1de-40c7-b78a-b9bfe87ff5cf	info	   at PuppeteerSharp.Launcher.LaunchAsync(LaunchOptions options) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Launcher.cs:line 99
   at PuppeteerSharp.Launcher.LaunchAsync(LaunchOptions options) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Launcher.cs:line 105
   at HeadlessChromium.Puppeteer.Lambda.Dotnet.HeadlessChromiumPuppeteerLauncher.LaunchAsync(String[] chromeArgs)
   at OneClick.Pagos.Interno.Scraping.Services.ScrapingService.StartScraping(Boolean throttled) in /codebuild/output/src614615540/src/bitbucket.org/defontana/one-click/OneClick.Pagos.Interno.Scraping/Services/ScrapingService.cs:line 92
   at OneClick.Pagos.Interno.Scraping.Services.ScrapingService.ManageScrapingRetryActions(Func`3 func, Int32 maxRetry, Boolean throttled) in /codebuild/output/src614615540/src/bitbucket.org/defontana/one-click/OneClick.Pagos.Interno.Scraping/Services/ScrapingService.cs:line 252

e0bbaeba-e1de-40c7-b78a-b9bfe87ff5cf info The WebSocket is in an invalid state ('Aborted') for this operation. Valid states are: 'Open, CloseReceived'

2024-09-27T23:06:45.865Z	e0bbaeba-e1de-40c7-b78a-b9bfe87ff5cf	info	Error executing ScrapingOneClick: Failed to create connection,
 StackTrace:    at PuppeteerSharp.Launcher.LaunchAsync(LaunchOptions options) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Launcher.cs:line 99
   at PuppeteerSharp.Launcher.LaunchAsync(LaunchOptions options) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Launcher.cs:line 105
   at HeadlessChromium.Puppeteer.Lambda.Dotnet.HeadlessChromiumPuppeteerLauncher.LaunchAsync(String[] chromeArgs)
   at OneClick.Pagos.Interno.Scraping.Services.ScrapingService.StartScraping(Boolean throttled) in /codebuild/output/src614615540/src/bitbucket.org/defontana/one-click/OneClick.Pagos.Interno.Scraping/Services/ScrapingService.cs:line 92
   at OneClick.Pagos.Interno.Scraping.Services.ScrapingService.ManageScrapingRetryActions(Func`3 func, Int32 maxRetry, Boolean throttled) in /codebuild/output/src614615540/src/bitbucket.org/defontana/one-click/OneClick.Pagos.Interno.Scraping/Services/ScrapingService.cs:line 252
   at OneClick.Pagos.Interno.Scraping.Services.ScrapingService.ManageScrapingRetryActions(Func`3 func, Int32 maxRetry, Boolean throttled) in /codebuild/output/src614615540/src/bitbucket.org/defontana/one-click/OneClick.Pagos.Interno.Scraping/Services/ScrapingService.cs:line 303
   at OneClick.Pagos.Interno.Core.Services.OneClickService.ScrapingOneClick(Nullable`1 date, Boolean isLambda, Int32 retries, Boolean throttled) in /codebuild/output/src614615540/src/bitbucket.org/defontana/one-click/OneClick.Pagos.Interno.Core/Services/OneClickService.cs:line 66
   at OneClick.Pagos.Interno.Lambda.LambdaScrapeOneClick.ScrapeOneClick(LambdaInputScrape InputScrape, ILambdaContext context) in /codebuild/output/src614615540/src/bitbucket.org/defontana/one-click/OneClick.Pagos.Interno/Lambda/LambdaScrapeOneClick.cs:line 70,
 InnerException: System.Net.WebSockets.WebSocketException (203): The WebSocket is in an invalid state ('Aborted') for this operation. Valid states are: 'Open, CloseReceived'
   at System.Net.WebSockets.WebSocketValidate.ThrowIfInvalidState(WebSocketState currentState, Boolean isDisposed, WebSocketState[] validStates)
   at System.Net.WebSockets.ManagedWebSocket.SendAsync(ReadOnlyMemory`1 buffer, WebSocketMessageType messageType, WebSocketMessageFlags messageFlags, CancellationToken cancellationToken)
--- End of stack trace from previous location ---
   at PuppeteerSharp.Helpers.TaskQueue.Enqueue(Func`1 taskGenerator) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Helpers/TaskQueue.cs:line 66
   at PuppeteerSharp.Cdp.Connection.SendAsync(String method, Object args, Boolean waitForCallback, CommandOptions options) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Cdp/Connection.cs:line 137
   at PuppeteerSharp.Cdp.ChromeTargetManager.InitializeAsync() in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Cdp/ChromeTargetManager.cs:line 61
   at PuppeteerSharp.Cdp.CdpBrowser.CreateAsync(SupportedBrowser browserToCreate, Connection connection, String[] contextIds, Boolean acceptInsecureCerts, ViewPortOptions defaultViewPort, LauncherBase launcher, Func`2 targetFilter, Func`2 isPageTargetCallback, Action`1 initAction) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Cdp/CdpBrowser.cs:line 181
   at PuppeteerSharp.Cdp.CdpBrowser.CreateAsync(SupportedBrowser browserToCreate, Connection connection, String[] contextIds, Boolean acceptInsecureCerts, ViewPortOptions defaultViewPort, LauncherBase launcher, Func`2 targetFilter, Func`2 isPageTargetCallback, Action`1 initAction) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Cdp/CdpBrowser.cs:line 187
   at PuppeteerSharp.Launcher.LaunchAsync(LaunchOptions options) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Launcher.cs:line 81
2024-09-27T23:06:45.885Z	e0bbaeba-e1de-40c7-b78a-b9bfe87ff5cf	fail	PuppeteerSharp.ProcessException: Failed to create connection
 ---> System.Net.WebSockets.WebSocketException (203): The WebSocket is in an invalid state ('Aborted') for this operation. Valid states are: 'Open, CloseReceived'
   at System.Net.WebSockets.WebSocketValidate.ThrowIfInvalidState(WebSocketState currentState, Boolean isDisposed, WebSocketState[] validStates)
   at System.Net.WebSockets.ManagedWebSocket.SendAsync(ReadOnlyMemory`1 buffer, WebSocketMessageType messageType, WebSocketMessageFlags messageFlags, CancellationToken cancellationToken)
--- End of stack trace from previous location ---
   at PuppeteerSharp.Helpers.TaskQueue.Enqueue(Func`1 taskGenerator) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Helpers/TaskQueue.cs:line 66
   at PuppeteerSharp.Cdp.Connection.SendAsync(String method, Object args, Boolean waitForCallback, CommandOptions options) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Cdp/Connection.cs:line 137
   at PuppeteerSharp.Cdp.ChromeTargetManager.InitializeAsync() in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Cdp/ChromeTargetManager.cs:line 61
   at PuppeteerSharp.Cdp.CdpBrowser.CreateAsync(SupportedBrowser browserToCreate, Connection connection, String[] contextIds, Boolean acceptInsecureCerts, ViewPortOptions defaultViewPort, LauncherBase launcher, Func`2 targetFilter, Func`2 isPageTargetCallback, Action`1 initAction) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Cdp/CdpBrowser.cs:line 181
   at PuppeteerSharp.Cdp.CdpBrowser.CreateAsync(SupportedBrowser browserToCreate, Connection connection, String[] contextIds, Boolean acceptInsecureCerts, ViewPortOptions defaultViewPort, LauncherBase launcher, Func`2 targetFilter, Func`2 isPageTargetCallback, Action`1 initAction) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Cdp/CdpBrowser.cs:line 187
   at PuppeteerSharp.Launcher.LaunchAsync(LaunchOptions options) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Launcher.cs:line 81
   --- End of inner exception stack trace ---
   at PuppeteerSharp.Launcher.LaunchAsync(LaunchOptions options) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Launcher.cs:line 99
   at PuppeteerSharp.Launcher.LaunchAsync(LaunchOptions options) in /home/runner/work/puppeteer-sharp/puppeteer-sharp/lib/PuppeteerSharp/Launcher.cs:line 105
   at HeadlessChromium.Puppeteer.Lambda.Dotnet.HeadlessChromiumPuppeteerLauncher.LaunchAsync(String[] chromeArgs)
   at OneClick.Pagos.Interno.Scraping.Services.ScrapingService.StartScraping(Boolean throttled) in /codebuild/output/src614615540/src/bitbucket.org/defontana/one-click/OneClick.Pagos.Interno.Scraping/Services/ScrapingService.cs:line 92
   at OneClick.Pagos.Interno.Scraping.Services.ScrapingService.ManageScrapingRetryActions(Func`3 func, Int32 maxRetry, Boolean throttled) in /codebuild/output/src614615540/src/bitbucket.org/defontana/one-click/OneClick.Pagos.Interno.Scraping/Services/ScrapingService.cs:line 252
   at OneClick.Pagos.Interno.Scraping.Services.ScrapingService.ManageScrapingRetryActions(Func`3 func, Int32 maxRetry, Boolean throttled) in /codebuild/output/src614615540/src/bitbucket.org/defontana/one-click/OneClick.Pagos.Interno.Scraping/Services/ScrapingService.cs:line 303
   at OneClick.Pagos.Interno.Core.Services.OneClickService.ScrapingOneClick(Nullable`1 date, Boolean isLambda, Int32 retries, Boolean throttled) in /codebuild/output/src614615540/src/bitbucket.org/defontana/one-click/OneClick.Pagos.Interno.Core/Services/OneClickService.cs:line 66
   at OneClick.Pagos.Interno.Lambda.LambdaScrapeOneClick.ScrapeOneClick(LambdaInputScrape InputScrape, ILambdaContext context) in /codebuild/output/src614615540/src/bitbucket.org/defontana/one-click/OneClick.Pagos.Interno/Lambda/LambdaScrapeOneClick.cs:line 70
   at OneClick.Pagos.Interno.Lambda.LambdaScrapeOneClick.ScrapeOneClick(LambdaInputScrape InputScrape, ILambdaContext context) in /codebuild/output/src614615540/src/bitbucket.org/defontana/one-click/OneClick.Pagos.Interno/Lambda/LambdaScrapeOneClick.cs:line 82
   at lambda_method1(Closure, Stream, ILambdaContext, Stream)
   at Amazon.Lambda.RuntimeSupport.HandlerWrapper.<>c__DisplayClass8_0.<GetHandlerWrapper>b__0(InvocationRequest invocation) in /src/Repo/Libraries/src/Amazon.Lambda.RuntimeSupport/Bootstrap/HandlerWrapper.cs:line 54
   at Amazon.Lambda.RuntimeSupport.LambdaBootstrap.InvokeOnceAsync(CancellationToken cancellationToken) in /src/Repo/Libraries/src/Amazon.Lambda.RuntimeSupport/Bootstrap/LambdaBootstrap.cs:line 185

Code:

        public async Task<(IBrowser browser, IPage page)> StartScraping(bool throttled = false)
        {
            IBrowser browser;
            IPage page;
            (int width, int height) = GetDefaultViewport();

            if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
            {
                // Optimized for AWS Lambda
                var browserLauncher = new HeadlessChromiumPuppeteerLauncher(_loggerFactory);

                var linuxArgs = new[]
                {
                    "--autoplay-policy=user-gesture-required",
                    "--disable-background-networking",
                    "--disable-background-timer-throttling",
                    "--disable-backgrounding-occluded-windows",
                    "--disable-breakpad",
                    "--disable-client-side-phishing-detection",
                    "--disable-component-update",
                    "--disable-default-apps",
                    "--disable-dev-shm-usage",
                    "--disable-domain-reliability",
                    "--disable-extensions",
                    "--disable-features=AudioServiceOutOfProcess,IsolateOrigins,site-per-process",
                    "--disable-hang-monitor",
                    "--disable-ipc-flooding-protection",
                    "--disable-offer-store-unmasked-wallet-cards",
                    "--disable-popup-blocking",
                    "--disable-print-preview",
                    "--disable-prompt-on-repost",
                    "--disable-renderer-backgrounding",
                    "--disable-setuid-sandbox",
                    "--disable-speech-api",
                    "--disable-sync",
                    "--disable-web-security",
                    "--disk-cache-size=33554432",
                    "--hide-scrollbars",
                    "--ignore-gpu-blocklist",
                    "--metrics-recording-only",
                    "--mute-audio",
                    "--no-default-browser-check",
                    "--no-first-run",
                    "--no-pings",
                    "--no-sandbox",
                    "--no-zygote",
                    "--password-store=basic",
                    "--use-gl=swiftshader",
                    "--use-mock-keychain",
                    "--single-process",
                    "--incognito",
                    $"--window-size={width},{height}"
                };

                browser = await browserLauncher.LaunchAsync(linuxArgs);
            }

I'll attach the full CloudWatch logs:

log-events-viewer-result.csv

Coolgatty avatar Sep 27 '24 23:09 Coolgatty

Is your very same code running on .NET 7 and are you upgrading to .NET 8? Is your app working in production now on .NET 7?

kblok avatar Sep 30 '24 17:09 kblok

Can you try removing the --single-process feature flag?

kblok avatar Oct 01 '24 15:10 kblok

> Is your very same code running on .NET 7 and are you upgrading to .NET 8? Is your app working in production now on .NET 7?

I never went through .NET 7, since Amazon does not support .NET 7 in Lambda, so I'm upgrading from .NET 6 to 8. Everything worked just fine on .NET 6 in production.

I'll try removing --single-process and see what happens.

Coolgatty avatar Oct 01 '24 23:10 Coolgatty

I removed the --single-process flag and I get similar errors:

2024-10-02T12:15:20.126Z 49b7394f-37d0-4eb1-88d9-0f0ec204738c info Protocol error (Performance.enable): Session closed. Most likely the Page has been closed.Close reason: The remote party closed the WebSocket connection without completing the close handshake. (The remote party closed the WebSocket connection without completing the close handshake.)

log-events-viewer-result (3).csv

Coolgatty avatar Oct 02 '24 12:10 Coolgatty

OK. Can you try changing the package dependency to this?

<PackageReference Include="PuppeteerSharp" Version="20.0.2">
  <ExcludeAssets>runtime</ExcludeAssets>
</PackageReference>

kblok avatar Oct 02 '24 13:10 kblok

> OK. Can you try changing the package dependency to this?
>
> <PackageReference Include="PuppeteerSharp" Version="20.0.2">
>   <ExcludeAssets>runtime</ExcludeAssets>
> </PackageReference>

Did it, same error.

Coolgatty avatar Oct 02 '24 19:10 Coolgatty

thanks for the update

satviktechie1986 avatar Oct 28 '24 12:10 satviktechie1986

System.Net.WebSockets

Can you dig further into the DLL above, as used in the PuppeteerSharp lib? It might be causing the issue.

satviktechie1986 avatar Oct 28 '24 13:10 satviktechie1986

> System.Net.WebSockets
>
> Can you dig further into the DLL above, as used in the PuppeteerSharp lib? It might be causing the issue.

Yes. I think it's the System.Net.WebSockets version in .NET 8. It would be great if someone could find a way to force the app to use the netstandard build and see whether the netstandard code works. Or try running the same version on .NET 6 or 7.

kblok avatar Oct 28 '24 13:10 kblok

I have tried specifying multiple target frameworks, but it does not seem to work as expected, since many dependencies need to be downgraded. Can't you upgrade the PuppeteerSharp project to .NET 8? I know there are many ghosts that need to be thrown out during an update to .NET 8 :)

satviktechie1986 avatar Oct 28 '24 14:10 satviktechie1986

> I have tried specifying multiple target frameworks, but it does not seem to work as expected, since many dependencies need to be downgraded. Can't you upgrade the PuppeteerSharp project to .NET 8? I know there are many ghosts that need to be thrown out during an update to .NET 8 :)

PuppeteerSharp is multi-target. It ships netstandard2.0 and .NET 8.

kblok avatar Oct 28 '24 14:10 kblok

Let me try downgrading the project's dependencies and specifying multiple target frameworks. Hope it works.

satviktechie1986 avatar Oct 28 '24 14:10 satviktechie1986

Fixed by the code below:

string[] args = { "--disable-setuid-sandbox", "--disable-dev-shm-usage", "--no-sandbox", "--single-process" };

var launchOptions = new LaunchOptions()
{
    ExecutablePath = chromeLocation,
    Args = args,
    Headless = true,
    //Timeout = 0,
    WebSocketFactory = async (uri, socketOptions, cancellationToken) =>
    {
        var client = SystemClientWebSocket.CreateClientWebSocket();
        if (client is System.Net.WebSockets.Managed.ClientWebSocket managed)
        {
            managed.Options.KeepAliveInterval = TimeSpan.FromSeconds(0);
            await managed.ConnectAsync(uri, cancellationToken);
        }
        else
        {
            var coreSocket = client as ClientWebSocket;
            // A keep-alive interval of zero means no periodic keep-alive pings.
            coreSocket.Options.KeepAliveInterval = TimeSpan.FromSeconds(0);

            context.Logger.LogInformation("uri --> " + uri);
            try
            {
                await coreSocket.ConnectAsync(uri, cancellationToken).ConfigureAwait(false);
                context.Logger.LogInformation("uri connected --> " + uri);
            }
            catch (Exception ex1)
            {
                // The exception is only logged; the client is still returned below.
                context.Logger.LogInformation("error  --> " + ex1);
            }
        }

        return client;
    },
};

var browser = await new Launcher(new LoggerFactory().AddSerilog(Log.Logger)).LaunchAsync(launchOptions);

satviktechie1986 avatar Oct 30 '24 07:10 satviktechie1986

@Coolgatty, Do you want to try that out?

kblok avatar Oct 30 '24 13:10 kblok

Could you help me with the imports? I'm not sure what I have to import to make that work.

Coolgatty avatar Nov 04 '24 17:11 Coolgatty

Thanks @satviktechie1986 and @kblok, I got my scraper running on .NET 8 in AWS Lambda without having to use a layer, using the following code:

        public async Task<(IBrowser browser, IPage page)> StartScraping(bool throttled = false)
        {
            IBrowser browser;
            IPage page;
            (int width, int height) = GetDefaultViewport();

            if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
            {

                string[] args = [
                    "--disable-setuid-sandbox",
                    "--disable-dev-shm-usage",
                    "--no-sandbox",
                    "--single-process"
                ];

                var chromeLocation = new ChromiumExtractor(_loggerFactory).ExtractChromium();
                var launchOptions = new LaunchOptions()
                {
                    ExecutablePath = chromeLocation,
                    Args = args,
                    Headless = true,
                    //Timeout = 0,
                    WebSocketFactory = async (uri, socketOptions, cancellationToken) =>
                    {
                        var client = new ClientWebSocket();
                        client.Options.KeepAliveInterval = TimeSpan.FromSeconds(0);

                        Console.WriteLine("uri --> " + uri);
                        try
                        {
                            await client.ConnectAsync(uri, cancellationToken).ConfigureAwait(false);
                            Console.WriteLine("uri connected --> " + uri);
                        }
                        catch (Exception ex1)
                        {
                            Console.WriteLine("error  --> " + ex1);
                        }

                        return client;
                    },
                };
                browser = await new Launcher(_loggerFactory).LaunchAsync(launchOptions);
            }

Coolgatty avatar Nov 04 '24 17:11 Coolgatty
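
A minimal sketch of the socket setup from the comment above, pulled into a standalone helper so the required `using` directives are visible. It assumes only the standard `System.Net.WebSockets` API; the class name `KeepAliveFreeSocket` and its method names are hypothetical, and unlike the snippet above it rethrows a failed connect instead of returning an unconnected socket. The thread does not establish which change actually fixed the launch, so treat the disabled keep-alive as the workaround reported here, not a confirmed root cause.

```csharp
using System;
using System.Net.WebSockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of PuppeteerSharp.
public static class KeepAliveFreeSocket
{
    // Creates a client socket with keep-alive pings turned off
    // (an interval of TimeSpan.Zero disables them).
    public static ClientWebSocket Create()
    {
        var client = new ClientWebSocket();
        client.Options.KeepAliveInterval = TimeSpan.Zero;
        return client;
    }

    // Connects to the given endpoint. If the connect fails, the socket is
    // disposed and the exception is rethrown so the caller sees the real error.
    public static async Task<WebSocket> ConnectAsync(Uri uri, CancellationToken cancellationToken)
    {
        var client = Create();
        try
        {
            await client.ConnectAsync(uri, cancellationToken).ConfigureAwait(false);
            return client;
        }
        catch
        {
            client.Dispose();
            throw;
        }
    }
}
```

Assuming the `WebSocketFactory` option takes the three arguments used in the comments above, it could be wired up as `WebSocketFactory = (uri, socketOptions, cancellationToken) => KeepAliveFreeSocket.ConnectAsync(uri, cancellationToken)`, with `using PuppeteerSharp;` for `LaunchOptions` and `Launcher`.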

We had a lot of issues running Chrome in Lambda, and we ended up making a NuGet package that downloads Chrome at build/publish time. The package should also work without any additional libraries installed on the system, which is a big plus: you can use whatever base image you want.

https://github.com/madcoons/chrome-for-testing-nuget

miroljub1995 avatar Jan 22 '25 01:01 miroljub1995