Deflake X-display tests on GitHub actions

Open chapulina opened this issue 5 years ago • 8 comments

Many ign-gui tests are failing like this on GitHub actions:

qt.qpa.screen: QXcbConnection: Could not connect to display 
Could not connect to any X display.

We should prevent these tests from running when no display is detected. As a reference, this is how Gazebo-classic detects it: https://github.com/osrf/gazebo/blob/6fd426b3949c4ca73fa126cde68f5cc4a59522eb/cmake/CheckDRIDisplay.cmake
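As a rough illustration of what "detecting a display" could look like outside of CMake, here is a minimal Python sketch. It is not a port of the Gazebo-classic script linked above. It assumes the `xdpyinfo` command-line tool is installed, and the function name `display_available` is made up for this example.

```python
import os
import shutil
import subprocess


def display_available(probe="xdpyinfo", timeout=5.0):
    """Return True if the X server named by $DISPLAY answers a probe.

    Assumption: `probe` is an X client such as xdpyinfo that accepts
    `-display <name>` and exits with 0 when the connection succeeds.
    """
    display = os.environ.get("DISPLAY")
    if not display:
        # No display configured at all.
        return False
    if shutil.which(probe) is None:
        # Probe tool missing: the display cannot be confirmed, so report False.
        return False
    try:
        result = subprocess.run(
            [probe, "-display", display],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Probe hung or could not be started.
        return False
    return result.returncode == 0
```

A test launcher could call `display_available()` once before starting the GUI tests and decide whether to run them based on the result.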

chapulina avatar May 07 '20 00:05 chapulina

We should prevent these tests from running when no display is detected

I would prefer a different approach than the one in Gazebo: add an option to the build/test system that indicates whether GUI tests need to be compiled/executed. This way it is easier to detect failures when the display is not working well, since the build will fail instead of silently reporting success while hiding errors.
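To make the difference between the two approaches concrete, here is a small Python sketch of the decision logic. It is only an illustration: `plan_gui_tests` and its two parameters are hypothetical names, not an existing option in the build system.

```python
def plan_gui_tests(gui_tests_enabled, display_ok):
    """Decide what to do with GUI tests under an explicit option.

    gui_tests_enabled: hypothetical build/test option set by the CI job.
    display_ok: result of whatever display check the job performs.
    """
    if not gui_tests_enabled:
        # Explicitly disabled: the skip is intentional and visible in the config.
        return "skip"
    if not display_ok:
        # Tests were requested but cannot run: fail loudly instead of
        # reporting success.
        raise RuntimeError("GUI tests are enabled but no display is available")
    return "run"
```

With auto-detection, the `(True, False)` case would quietly become a skip. With an explicit option it becomes an error, which is the behavior argued for above.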

j-rivero avatar Jul 27 '20 13:07 j-rivero

X display tests were fixed on GitHub Actions in #98, but they're still flaky. This is the new error:

  [GUI] [Wrn] [Application.cc:649] [QT] could not connect to display :1.0
  [GUI] [Err] [Application.cc:653] [QT] This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
  
  Available platform plugins are: eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, xcb.

chapulina avatar Feb 22 '21 22:02 chapulina

As a reference, ign-rendering also uses Xvfb and suffers from the same flakiness. Here's an example error message:

   [ RUN      ] Camera/CameraTest.RenderTexture/ogre2
  [Err] [Ogre2RenderEngine.cc:338] Unable to open display: :1.0
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:742] Unable to create the rendering window
  
  
        Start  2: check_UNIT_Camera_TEST
   2/67 Test  #2: check_UNIT_Camera_TEST ................***Failed    0.03 sec

It doesn't always happen to the same test. I've not been able to identify a pattern (e.g. it's always the first test). It looks like the display just can't be found for one test, but then it's found again.

I'm trying out different Xvfb arguments, and also trying to make the failure more verbose. The thing is that this failure isn't very common, so it's hard to reproduce.
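One way to get more verbose output around an intermittent "display not found" failure is to probe the display a few times and log every attempt. The sketch below is an assumption about how that could be done, not what the CI currently does. `wait_for` is a hypothetical helper, and `probe` is any callable that returns True when the display answers.

```python
import time


def wait_for(probe, attempts=5, delay=1.0, log=print):
    """Call probe() until it returns True or attempts run out; log each try."""
    for attempt in range(1, attempts + 1):
        ok = probe()
        status = "ok" if ok else "failed"
        log(f"display probe attempt {attempt}/{attempts}: {status}")
        if ok:
            return True
        if attempt < attempts:
            # Give the X server a moment before trying again.
            time.sleep(delay)
    return False
```

If the display disappears for a single test and then comes back, the log should show one or two failed attempts followed by a success, which would at least confirm the pattern described above.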


https://github.com/osrf/buildfarmer/issues/161

chapulina avatar Feb 24 '21 00:02 chapulina

Got a new error today that may help debug this a bit more:

   [ RUN      ] Scene3DTest.Events
  [GUI] [Wrn] [Application.cc:657] [QT] The X11 connection broke: Unknown error (code 80)
  XIO:  fatal IO error 2 (No such file or directory) on X server "0��LV"
        after 520 requests (520 known processed) with 0 events remaining.
  [GUI] [Wrn] [Application.cc:657] [QT] QObject::~QObject: Timers cannot be stopped from another thread
  [GUI] [Wrn] [Application.cc:657] [QT] QObject::~QObject: Timers cannot be stopped from another thread

It's possible that Xvfb is being killed due to high memory usage.

chapulina avatar Jun 25 '21 21:06 chapulina

Using EGL may solve this issue.

chapulina avatar Dec 13 '21 20:12 chapulina

have an option in the build/test system to indicate that GUI tests needs to be compiled/executed or not

We could revisit this idea and expose a CMake argument that lets us disable the tests which require a display on GitHub actions, but leave them enabled on Jenkins.

Another alternative that @mjcarroll brought up was to try using one of the other platform plugins suggested in one of the errors above:

Available platform plugins are: eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, xcb.
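Qt picks its platform plugin from the `QT_QPA_PLATFORM` environment variable, so trying one of these plugins does not require code changes in the tests. Below is a minimal Python sketch of a launcher helper. `offscreen_env` is a hypothetical name, and whether the GUI tests actually pass under `offscreen` is untested here, especially for tests that need real rendering.

```python
import os


def offscreen_env(base_env=None, platform="offscreen"):
    """Return a copy of the environment with QT_QPA_PLATFORM set.

    base_env defaults to the current process environment; the original
    mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["QT_QPA_PLATFORM"] = platform
    return env
```

A CI script could then run the tests with something like `subprocess.run(["ctest", "--output-on-failure"], env=offscreen_env())`, leaving the default `xcb` plugin in place for jobs that do have a working display.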

chapulina avatar Jun 14 '22 17:06 chapulina

Another option potentially? https://github.com/uwerat/qpagbm

mjcarroll avatar Jun 14 '22 21:06 mjcarroll

This should be significantly improved after #419. I won't close this issue until the PR is forward-ported and we have some confidence that there are no remaining odd failures on GitHub Actions.

Specific failures should be tracked in their own issues, like this one: https://github.com/gazebosim/gz-gui/issues/421

Blast545 avatar Jun 16 '22 17:06 Blast545