Tests occasionally fail with "timeout waiting for Xvfb"

Open ralimi opened this issue 2 years ago • 1 comments

An example run: https://github.com/google/chrome-ssh-agent/runs/7818213408

The error we get is "timeout waiting for Xvfb", and it occurs in the end-to-end tests during initialization of Selenium (https://github.com/tebeka/selenium/blob/e9100b7f5ac11727841302026707e3961ba14712/service.go#L377).

ralimi avatar Aug 13 '22 15:08 ralimi

By hacking around with the Selenium code, I managed to get a failed test with the Xvfb stderr output:

_XSERVTransmkdir: Owner of /tmp/.X11-unix should be set to root

ralimi avatar Aug 14 '22 05:08 ralimi

Another failed run showed: _XSERVTransmkdir: ERROR: euid != 0,directory /tmp/.X11-unix will not be created.

ralimi avatar Aug 14 '22 15:08 ralimi

Okay, closer to the full error now:

exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //test:test
-----------------------------------------------------------------------------
_XSERVTransmkdir: Owner of /tmp/.X11-unix should be set to root
--- FAIL: TestWebApp (3.03s)
    e2e.go:113: failed to start Selenium service: error starting frame buffer: timeout waiting for Xvfb
    e2e.go:93: Dumping log: SeleniumOutput
FAIL
(EE) 
Fatal server error:
(EE) Cannot write display number to fd 3
(EE) 

It's attempting to write the display number to the file descriptor opened by Selenium (https://github.com/tebeka/selenium/blob/e9100b7f5ac11727841302026707e3961ba14712/service.go#L334), but it fails.

ralimi avatar Aug 14 '22 15:08 ralimi

After experimenting a bit, some more observations:

  1. This seems to happen only (a) after a system reboot, and (b) when the timeout for starting Xvfb is 3 seconds.
  2. Changing either (a) or (b) avoids the issue. For example, retrying without a system reboot works reliably, as does setting the timeout to 10 seconds.

I think the explanations for the above error messages are:

  • Cannot write display number to fd 3: This happens as a result of Selenium timing out and closing the pipe; Xvfb eventually starts up, but by then the pipe is closed, so Xvfb reports the error. That is, this is a side effect, not the root cause.
  • ERROR: euid != 0,directory /tmp/.X11-unix will not be created. and Owner of /tmp/.X11-unix should be set to root: these appear innocuous, and are not the actual cause of the issue.

It seems like the proper fix here is to just give it a more generous timeout to handle cases where Xvfb needs more time to start up.

ralimi avatar Aug 14 '22 16:08 ralimi

curious if there's a reason you're using X/xvfb and not chrome in headless mode

vapier avatar Aug 14 '22 18:08 vapier

> curious if there's a reason you're using X/xvfb and not chrome in headless mode

Chrome extensions don't work with headless mode. See https://bugs.chromium.org/p/chromium/issues/detail?id=706008

ralimi avatar Aug 14 '22 18:08 ralimi

> Chrome extensions don't work with headless mode. See https://bugs.chromium.org/p/chromium/issues/detail?id=706008

is that still accurate? the last few comments in that bug mention that --headless=chrome now exists, and you can load extensions in that mode.

vapier avatar Aug 17 '22 10:08 vapier

Ah - interesting. In digging into this earlier, I had found a claim on Stack Overflow that it wasn't supported (and wasn't going to be supported). Didn't catch the last comment there when I dug up that issue.

Merged a change to use --headless=chrome and remove use of Xvfb. Thanks!

ralimi avatar Aug 18 '22 03:08 ralimi