“could not load OpenGL subsystem” bug with Nvidia when running a non-release build

Open illwieckz opened this issue 3 years ago • 12 comments

I usually build the game on my workstation running Ubuntu 20.04 (no Nvidia stuff on it, at all), then run the game on various hosts with Ubuntu 20.04 running from a USB key. Usually, I rsync my build and run it, but while testing 473 I ran into trouble…

On the USB key, running Nvidia driver, I got this:

^3Warn: GLW_StartOpenGL() - could not load OpenGL subsystem: Missing GL version

One thing to note: if I'm right, I was running the old 340 driver on the old 5.4 kernel.

I then rebuilt the game from the system on the USB key, got the same error.

At the same time the 0.52 release build works on this system.

illwieckz avatar May 31 '21 17:05 illwieckz

So, with the same custom non-release build, on the same computer (with a Quadro K1100M) and the same system, I get the error with the 340 driver but not with the 390 driver:

| Nvidia → | 340.108 | 390.143 |
|---|---|---|
| Linux 5.4.0 | error | |
| Linux 5.8.0 | error | works |

But with the release build, built in Docker, it always works:

| Nvidia → | 340.108 | 390.143 |
|---|---|---|
| Linux 5.4.0 | works | |
| Linux 5.8.0 | works | works |

illwieckz avatar Jun 02 '21 23:06 illwieckz

Hmm, the 340 driver is the buggy one for which we implemented a detection that disables an optional feature to work around a driver bug, see #340 and #370.

illwieckz avatar Jun 03 '21 01:06 illwieckz

And we read the version to do the detection…

illwieckz avatar Jun 03 '21 01:06 illwieckz

Running ldd daemon on the self-built binary shows libGLdispatch.so.0, but running it on the release build does not.

illwieckz avatar Jun 04 '21 17:06 illwieckz

With:

cmake -DOpenGL_GL_PREFERENCE=LEGACY ..

and calling make, it relinks daemon and I can run it.

If this variable is unset or set to GLVND, I cannot run the game on the 340 driver.

illwieckz avatar Jun 04 '21 18:06 illwieckz

From: https://cmake.org/cmake/help/latest/module/FindOpenGL.html#linux-specific

Some Linux systems utilize GLVND as a new ABI for OpenGL. GLVND separates context libraries from OpenGL itself; OpenGL lives in "libOpenGL", and contexts are defined in "libGLX" or "libEGL". GLVND is currently the only way to get OpenGL 3+ functionality via EGL in a manner portable across vendors. Projects may use GLVND explicitly with target OpenGL::OpenGL and either OpenGL::GLX or OpenGL::EGL.

Projects may use the OpenGL::GL target (or OPENGL_LIBRARIES variable) to use legacy GL interfaces. These will use the legacy GL library located by OPENGL_gl_LIBRARY, if available. If OPENGL_gl_LIBRARY is empty or not found and GLVND is available, the OpenGL::GL target will use GLVND OpenGL::OpenGL and OpenGL::GLX (and the OPENGL_LIBRARIES variable will use the corresponding libraries). Thus, for non-EGL-based Linux targets, the OpenGL::GL target is most portable.

The OpenGL target was recently changed from ${OPENGL_LIBRARIES} to OpenGL::GL in c1c5d592bcdca90c070eae2bd54f4be653a546fd. Maybe that's related?

We would have to test the game on Wayland with AMD or Intel (or Nvidia with nouveau) and standard GBM, and to test the game on Wayland with Nvidia's non-standard EGLStream.

illwieckz avatar Jun 04 '21 18:06 illwieckz

Reverting to ${OPENGL_LIBRARIES} instead of OpenGL::GL and not using OpenGL_GL_PREFERENCE=LEGACY leads to a segfault, while using OpenGL::GL without OpenGL_GL_PREFERENCE=LEGACY makes the game fail to load, but it handles the error and displays a message instead.

illwieckz avatar Jun 04 '21 18:06 illwieckz

Another way to produce this error message (could not load OpenGL subsystem: Missing GL version) is to set r_glMajorVersion 9

slipher avatar Jun 07 '21 01:06 slipher

Yes, but while the message is the same, the cause is probably different. With the current code, it tries to load a nonexistent GL 9, then fails to create the context because GL 9 does not exist.

But on those old drivers without the LEGACY cmake option, it looks like the engine is able to create a GL 3.2 context before failing…

With my gldetect branch (see #478), I can see that:

Debug: SDL borderless window created at 0,0 with 1920×1080 size
Debug: Valid context: 16-bit GL 2.1 compat
Debug: Valid context: 24-bit GL 2.1 compat
Debug: Invalid context: 16-bit GL 2.2 compat
Debug: Invalid context: 24-bit GL 2.2 compat
Debug: Valid context: 16-bit GL 3.0 compat
Debug: Valid context: 24-bit GL 3.0 compat
Debug: Valid context: 16-bit GL 3.1 compat
Debug: Valid context: 24-bit GL 3.1 compat
Debug: Valid context: 16-bit GL 3.2 core
Debug: Valid context: 24-bit GL 3.2 core
Debug: Valid context: 16-bit GL 3.3 core
Debug: Valid context: 24-bit GL 3.3 core
Debug: Invalid context: 16-bit GL 3.4 core
Debug: Invalid context: 24-bit GL 3.4 core
Debug: Valid context: 16-bit GL 4.0 core
Debug: Valid context: 24-bit GL 4.0 core
Debug: Valid context: 16-bit GL 4.1 core
Debug: Valid context: 24-bit GL 4.1 core
Debug: Valid context: 16-bit GL 4.2 core
Debug: Valid context: 24-bit GL 4.2 core
Debug: Valid context: 16-bit GL 4.3 core
Debug: Valid context: 24-bit GL 4.3 core
Debug: Valid context: 16-bit GL 4.4 core
Debug: Valid context: 24-bit GL 4.4 core
Debug: Invalid context: 16-bit GL 4.5 core
Debug: Invalid context: 24-bit GL 4.5 core
Debug: Invalid context: 16-bit GL 5.0 core
Debug: Invalid context: 24-bit GL 5.0 core
Best context: 24-bit GL 4.4 core
Debug: Created best context: 24-bit GL 4.4 core
Using 24 Color bits, 0 depth, 8 stencil display.
Debug: Destroying 1920×1080 SDL window at 0,0
Warn: GLimp_SetMode: could not load OpenGL subsystem: Missing GL version

The engine managed to create GL contexts from 2.1 to 4.4 properly (the maximum this driver supports for that hardware), then failed…
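For readers who have not looked at the branch, the probing shown in the log can be sketched roughly as below. This is not the code from gldetect, only a minimal illustration of the idea: `GLConfig`, `isBetter`, `findBestContext`, `describe` and the `probe` callback are hypothetical names, and a real engine would create the contexts through SDL instead of a callback.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical description of one context configuration to probe.
struct GLConfig {
    int colorBits; // 16 or 24
    int major;     // GL major version
    int minor;     // GL minor version
    bool core;     // true: core profile, false: compatibility profile
};

// Prefer the newer GL version first, then the deeper color buffer.
bool isBetter(const GLConfig& a, const GLConfig& b) {
    if (a.major != b.major) return a.major > b.major;
    if (a.minor != b.minor) return a.minor > b.minor;
    return a.colorBits > b.colorBits;
}

// Tries every candidate through `probe` (an assumed stand-in for real context
// creation) and stores the best working one in `best`.
// Returns false when no candidate worked at all.
bool findBestContext(const std::vector<GLConfig>& candidates,
                     const std::function<bool(const GLConfig&)>& probe,
                     GLConfig& best) {
    assert(probe && "a probe callback is required");
    bool found = false;
    for (const GLConfig& candidate : candidates) {
        if (!probe(candidate)) continue; // an "Invalid context" line in the log
        if (!found || isBetter(candidate, best)) {
            best = candidate;
            found = true;
        }
    }
    return found;
}

// Formats a configuration the way the log above prints it.
std::string describe(const GLConfig& c) {
    return std::to_string(c.colorBits) + "-bit GL " + std::to_string(c.major) +
           "." + std::to_string(c.minor) + (c.core ? " core" : " compat");
}
```

With a probe that accepts everything up to 4.4 and rejects 4.5 and 5.0, `findBestContext` settles on the 24-bit GL 4.4 core entry, which matches the "Best context" line. The point is that the probing itself succeeds here, so the failure has to come from a later step.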

illwieckz avatar Jun 07 '21 01:06 illwieckz

The failure is on GLEW side:

	glewResult = glewInit();

	if ( glewResult != GLEW_OK )
	{
		// glewInit failed, something is seriously wrong
		GLimp_DestroyWindow();
		Sys::Error( "GLimp_SetMode: could not load OpenGL subsystem: %s", glewGetErrorString( glewResult ) );
	}
	else
	{
		logger.Notice("Using GLEW %s", glewGetString( GLEW_VERSION ) );
	}

Edit: on this computer, GLEW is provided by the distro.

illwieckz avatar Jun 07 '21 01:06 illwieckz

With my gldetect branch, this test can't be reached if no GL context can be properly loaded (it aborts before). So the error message on my branch can even print that:

GLEW initialization failed: Missing GL version.

Engine successfully created 24-bit GL 4.4 core context,
This is a GLEW issue.

illwieckz avatar Jun 07 '21 02:06 illwieckz
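The reasoning behind that message can be sketched as follows. It is only an illustration under assumptions, not the branch's code: `GLStartupState` and `describeStartupFailure` are hypothetical names, and the wording of the "no context" case is made up for the example. Only the GLEW case mirrors the output quoted above.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of what happened during renderer startup.
struct GLStartupState {
    bool contextCreated;     // did the engine manage to create a GL context?
    std::string contextName; // e.g. "24-bit GL 4.4 core"
    bool loaderInitialized;  // did the function loader (GLEW here) initialize?
    std::string loaderError; // e.g. the text returned by glewGetErrorString()
};

// Builds a message that says which layer failed.
// Returns an empty string when nothing failed.
std::string describeStartupFailure(const GLStartupState& s) {
    if (!s.contextCreated) {
        // Assumed wording, not taken from the engine.
        return "Could not create any GL context.";
    }
    if (!s.loaderInitialized) {
        // A context exists, so the driver accepted the request:
        // the failure is on the loader side.
        assert(!s.contextName.empty());
        return "GLEW initialization failed: " + s.loaderError + ".\n\n" +
               "Engine successfully created " + s.contextName + " context,\n" +
               "This is a GLEW issue.";
    }
    return "";
}
```

Keeping the two checks separate is what lets the message blame GLEW with some confidence: by the time the loader is initialized, the context is already known to be valid.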

I believe https://github.com/DaemonEngine/Daemon/pull/483 may fix this. Not sure.

necessarily-equal avatar Jun 11 '21 17:06 necessarily-equal