
ADO extension fails when run in a container: handleRequestFunction failed, { message: 'read ECONNRESET' }

Open katydecorah opened this issue 1 year ago • 8 comments

Describe the bug

When run in a container, the ADO extension fails with:

 handleRequestFunction failed, reclaiming failed request back to the list or queue

It will sometimes include additional error messages, such as:

  • ProtocolError: Protocol error (Target.setAutoAttach): Target closed.
  • caused by: Error: "{ message: 'read ECONNRESET' }"

To Reproduce

You can reproduce the error by running the extension in a container. https://learn.microsoft.com/en-us/azure/devops/pipelines/process/container-phases?view=azure-devops#single-job

pool:
  vmImage: 'windows-2019'

container: mcr.microsoft.com/windows/servercore:ltsc2019

steps:
- script: set

- task: NodeTool@0
  inputs:
    versionSpec: '16.x'

- script: npm install [email protected] -g
  displayName: install yarn as a global dependency
    
- task: accessibility-insights.prod.task.accessibility-insights@3
  displayName: '[should fail] staging accessibility insights task against URL with known failures'
  inputs:
    url: 'https://www.washington.edu/accesscomputing/AU/before.html'

Repro example

  • https://dev.azure.com/accessibility-insights-private/Accessibility%20Insights%20(private)/_build/results?buildId=42790&view=logs&j=12f1170f-54f2-53f3-20dd-22fc7dff55f9&t=7384d774-f7ca-599c-ee57-ab2c05be9247
  • https://dev.azure.com/accessibility-insights-private/Accessibility%20Insights%20(private)/_build/results?buildId=42788&view=logs&j=6b902995-b73d-5f5c-66fd-a7f66c857d2c&t=ff9c4bdf-81d4-57c3-e146-7d6c292d3dad

Expected behavior

According to the "Running Puppeteer in Docker" section of Puppeteer's Troubleshooting guide, there are likely extra steps we should recommend to the user.

It's not clear to me whether there's anything we can do on the extension side, or whether this can be helped with added documentation.

Screenshots

Context (please complete the following information)

  • OS Name & Version:
  • Azure DevOps Extension Version & Environment:
  • Browser Version:
  • Target Page:

Are you willing to submit a PR?

Yes

Did you search for similar existing issues?

Additional context

katydecorah avatar Mar 03 '23 20:03 katydecorah

This issue has been marked as ready for team triage; we will triage it in our weekly review and update the issue. Thank you for contributing to Accessibility Insights!

If the issue is that the container is missing system dependencies required to successfully run Chromium, this may be yet another reason that it would be nice to switch the service from Puppeteer to Playwright. Playwright gives a much nicer error message for that case (noting before trying to start that the system is missing required dependencies, enumerating which ones, and offering an npx playwright install-deps chromium command to install them for you).

dbjorge avatar Mar 03 '23 21:03 dbjorge

Investigated this a bit further and confirmed that this does seem to be a result of trying to run in Windows Server Core without having installed the prerequisite Server-Media-Foundation Windows Feature required to run Chromium in that Windows SKU. This is true irrespective of whether you're using Server Core via a container or host VM, but in practice we usually see host VMs running full Server SKUs and containers based on something like mcr.microsoft.com/windows/servercore:ltsc2022 or mcr.microsoft.com/windows/servercore:ltsc2019, so it tends to be correlated with users trying to use containers.

On servercore:ltsc2019 it's possible to install the prerequisite windows feature with a step along the lines of Install-WindowsFeature Server-Media-Foundation, but this doesn't work on servercore:ltsc2022 without a much more involved workaround.

Other errors with similar symptoms:

  • If you try to use a linux container without the required prerequisites installed (eg, mcr.microsoft.com/devcontainers/javascript-node:16), you're likely to get an error along these lines:
ERROR PuppeteerCrawler: handleRequestFunction failed, reclaiming failed request back to the list or queue {"url":"https://site-under-test.com","retryCount":1,"id":"mCnLT4bz8fgQgev"}
  Error: Failed to launch the browser process!
  /home/node/.cache/puppeteer/chrome/linux-1108766/chrome-linux/chrome: error while loading shared libraries: libnss3.so: cannot open shared object file: No such file or directory
  • If you try to use a linux container that does have the required prerequisites installed (eg, mcr.microsoft.com/playwright:v1.33.0-jammy), you're likely to get an error along these lines:
ERROR PuppeteerCrawler: handleRequestFunction failed, reclaiming failed request back to the list or queue {"url":"http://localhost:5858","retryCount":1,"id":"hgW4ugjCDUL55FU"}
  Error: Failed to launch the browser process!
  [0429/002056.263116:FATAL:zygote_host_impl_linux.cc(127)] No usable sandbox! Update your kernel or see https://chromium.googlesource.com/chromium/src/+/main/docs/linux/suid_sandbox_development.md for more information on developing with the SUID sandbox. If you want to live dangerously and need an immediate workaround, you can try using --no-sandbox.

The latter case (where the correct deps are installed on Linux) happens because Chromium's sandboxing only works in Docker containers if you pass a specific Docker seccomp policy while running docker commands, and Azure Pipelines does not pass this by default. There's no workaround that I'm aware of except disabling Chromium's sandboxing, which would be pretty questionable for us to do from a crawler.
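The failure signatures quoted in this thread can be told apart by their log text. Below is a minimal TypeScript sketch of that mapping; the function name `classifyLaunchFailure` and the cause labels are hypothetical and are not part of the extension.

```typescript
// Hypothetical helper: maps crawler log text to the likely cause discussed in this thread.
type LaunchFailureCause =
  | "missing-linux-shared-library" // e.g. libnss3.so not installed in the container
  | "chromium-sandbox-unavailable" // "No usable sandbox!" inside a Docker container
  | "browser-closed-unexpectedly" // "Target closed" / "read ECONNRESET"
  | "unknown";

function classifyLaunchFailure(logText: string): LaunchFailureCause {
  if (/error while loading shared libraries/.test(logText)) {
    return "missing-linux-shared-library";
  }
  if (/No usable sandbox!/.test(logText)) {
    return "chromium-sandbox-unavailable";
  }
  if (/Target closed|ECONNRESET/.test(logText)) {
    return "browser-closed-unexpectedly";
  }
  return "unknown";
}
```

The patterns are taken from the messages quoted above; other Chromium or Puppeteer versions may word them differently.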


In summary, I think the short-term recommendation we should document is that users must run the task from one of the following environments:

  • A host VM (not a docker container) which is based on a Windows Server SKU or a Linux image which is tested to work with Chromium, for example any of the standard Azure Pipelines Hosted Agent images or standard 1ES PT images.
  • A Windows (not linux) Docker container which is based on a Windows or Windows Server container base image (not servercore or nanoserver)

...noting specifically that Docker containers based on linux, servercore, or nanoserver are not supported.
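As a rough illustration of the recommendation above, the rules could be written as a small check. This is only a sketch: `AgentEnvironment` and `isSupportedEnvironment` are hypothetical names, and matching on the base image name is an assumption rather than something the task does today.

```typescript
// Hypothetical description of where the task is running.
interface AgentEnvironment {
  os: "windows" | "linux";
  inContainer: boolean;
  // Container base image reference, e.g. "mcr.microsoft.com/windows/servercore:ltsc2019".
  baseImage?: string;
}

function isSupportedEnvironment(env: AgentEnvironment): boolean {
  if (!env.inContainer) {
    // Host VMs are treated as supported here; this sketch does not check
    // whether the host image itself has been tested to work with Chromium.
    return true;
  }
  if (env.os === "linux") {
    // Linux containers are not supported (Chromium sandbox issue).
    return false;
  }
  // Windows containers are supported unless based on servercore or nanoserver.
  const image = (env.baseImage || "").toLowerCase();
  return !/servercore|nanoserver/.test(image);
}
```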

Medium-term, it would be good to detect this case early and improve our error messaging. On Windows, we could do something comparable to Playwright's install_media_pack.ps1 script (but with Get-WindowsFeature instead of Install-WindowsFeature) to detect whether we're on a server SKU without the required prerequisite.
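One way the early detection might look from the TypeScript side is sketched below. It assumes the task would run a PowerShell one-liner in a child process and interpret its output; the constant and function names are hypothetical, and the sketch relies on `Get-WindowsFeature` returning an object with an `Installed` property on Server SKUs.

```typescript
// Assumed PowerShell one-liner; expected to print "True" or "False" on Server SKUs.
const FEATURE_CHECK_COMMAND =
  "(Get-WindowsFeature -Name Server-Media-Foundation).Installed";

type FeatureState = "installed" | "missing" | "unknown";

// Interprets the stdout of FEATURE_CHECK_COMMAND.
// Anything other than True/False (for example, empty output on SKUs where
// the cmdlet is unavailable) is reported as "unknown" rather than guessed.
function parseFeatureCheckOutput(stdout: string): FeatureState {
  const value = stdout.trim().toLowerCase();
  if (value === "true") return "installed";
  if (value === "false") return "missing";
  return "unknown";
}
```

A "missing" result could then be turned into a clear error before the crawler tries to launch Chromium.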

If we get substantial user requests for linux support, I'm not sure we have a great solution. We could add some sort of chromiumSandbox: false task input to let users opt into disabling sandboxing, which would enable running on containers with prerequisites installed, but I'm not convinced we should allow that: it carries risks that users are likely to dismiss.
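For illustration only, such an opt-out could translate into Chromium launch arguments roughly as follows. The `chromiumSandbox` input does not exist today; `TaskInputs` and `buildChromiumArgs` are hypothetical names.

```typescript
interface TaskInputs {
  // Hypothetical input floated in the comment above; the sandbox stays on unless
  // the user explicitly sets this to false.
  chromiumSandbox?: boolean;
}

// Returns extra Chromium command-line arguments for the given inputs.
function buildChromiumArgs(inputs: TaskInputs): string[] {
  const sandboxEnabled = inputs.chromiumSandbox !== false;
  // --no-sandbox is the flag named in Chromium's own error message.
  return sandboxEnabled ? [] : ["--no-sandbox"];
}
```

The resulting array would be passed as the `args` launch option when starting the browser. Keeping the sandbox on by default means a user has to opt out explicitly.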

dbjorge avatar Apr 29 '23 01:04 dbjorge

Updated internal documentation per above comment's short-term recommendations.

dbjorge avatar May 01 '23 18:05 dbjorge

This issue has been marked as ready for team triage; we will triage it in our weekly review and update the issue. Thank you for contributing to Accessibility Insights!

The remaining work for this is the paragraph starting with "Medium-term" in the above comment, discussing making the error message more clear. Marking as ready for triage for us to discuss if/when to do that.

dbjorge avatar May 22 '23 19:05 dbjorge

We want to do this if it's fairly easy

DaveTryon avatar Jun 01 '23 18:06 DaveTryon

"Investigated this a bit further and confirmed that this does seem to be a result of trying to run in Windows Server Core without having installed the prerequisite Server-Media-Foundation Windows Feature required to run Chromium in that Windows SKU."

Apparently, the missing Server-Media-Foundation Windows feature has nothing to do with this issue. Running Chrome or Chromium in a Windows Server container does not require it.

To successfully run Chrome or Chromium in a Windows or Linux container, install the required fonts.

On Ubuntu, you can install the required fonts using the following script:

apt-get update \
&& apt-get install -y wget gnupg \
&& wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
&& sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
&& apt-get update \
&& apt-get install -y google-chrome-stable fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-kacst fonts-freefont-ttf libxss1 libxtst6 procps \
--no-install-recommends

On Windows Server Core, enable the following optional Windows features:

  • ServerCoreFonts-NonCritical-Fonts-MinConsoleFonts
  • ServerCoreFonts-NonCritical-Fonts-Support
  • ServerCoreFonts-NonCritical-Fonts-BitmapFonts
  • ServerCoreFonts-NonCritical-Fonts-TrueType
  • ServerCoreFonts-NonCritical-Fonts-UAPFonts

Then register the fonts using the Win32 GDI API function AddFontResource. A detailed solution can be found here.

[DllImport("gdi32.dll")]
static extern int AddFontResource(string lpFilename);

lamaks avatar Sep 08 '23 00:09 lamaks