open-gpu-kernel-modules

eglSwapBuffers failed with 0x300d; app windows do not render

Open FineWolf opened this issue 10 months ago • 3 comments

NVIDIA Open GPU Kernel Modules Version

570.86.16-2

Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.

  • [ ] I confirm that this does not happen with the proprietary driver package.
  • Author Note: I did not have time to swap my setup to the proprietary driver package to test; if someone could try to replicate with the proprietary package, that would be great.

Operating System and Version

Arch Linux

Kernel Release

6.13.2-zen1-1-zen

Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.

  • [x] I am running on a stable kernel release.

Hardware: GPU

GPU 0: NVIDIA GeForce RTX 4090 (UUID: GPU-0fc352e5-7f1b-7fd6-1d4a-7c06358cbf69)

Describe the bug

Since updating to 570.86.16, a lot of apps are reporting eglSwapBuffers EGL_BAD_SURFACE errors.
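
For readers matching the title against this description: 0x300d is the numeric value of EGL_BAD_SURFACE in the Khronos EGL headers. The sketch below decodes such codes and pulls them out of log text. The helper names (`decode_egl_error`, `egl_errors_in_log`) are made up for illustration, and the log pattern assumes the message has the same wording as the issue title.

```python
import re

# EGL error codes as defined in the Khronos EGL header (EGL/egl.h).
EGL_ERRORS = {
    0x3000: "EGL_SUCCESS",
    0x3001: "EGL_NOT_INITIALIZED",
    0x3002: "EGL_BAD_ACCESS",
    0x3003: "EGL_BAD_ALLOC",
    0x3004: "EGL_BAD_ATTRIBUTE",
    0x3005: "EGL_BAD_CONFIG",
    0x3006: "EGL_BAD_CONTEXT",
    0x3007: "EGL_BAD_CURRENT_SURFACE",
    0x3008: "EGL_BAD_DISPLAY",
    0x3009: "EGL_BAD_MATCH",
    0x300A: "EGL_BAD_NATIVE_PIXMAP",
    0x300B: "EGL_BAD_NATIVE_WINDOW",
    0x300C: "EGL_BAD_PARAMETER",
    0x300D: "EGL_BAD_SURFACE",
    0x300E: "EGL_CONTEXT_LOST",
}

# Assumed message shape, taken from the issue title.
SWAP_FAILURE_RE = re.compile(r"eglSwapBuffers failed with (0x[0-9a-fA-F]+)")


def decode_egl_error(code: int) -> str:
    """Map a numeric EGL error code to its symbolic name."""
    return EGL_ERRORS.get(code, f"unknown (0x{code:04x})")


def egl_errors_in_log(text: str) -> list[str]:
    """Return the symbolic name for every eglSwapBuffers failure found in text."""
    return [decode_egl_error(int(m.group(1), 16))
            for m in SWAP_FAILURE_RE.finditer(text)]


if __name__ == "__main__":
    print(decode_egl_error(0x300D))
    print(egl_errors_in_log("eglSwapBuffers failed with 0x300d"))
```

Feeding application or journal output to `egl_errors_in_log` gives a quick way to check whether every failure is the same EGL_BAD_SURFACE error or whether other codes show up as well.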

This can be easily replicated with krunner. Triggering it once works, but any subsequent launch causes the error and prevents the window from being displayed.

KDE and Qt were notified; the consensus seems to be that this is a driver bug.

The issue happened with 565.x as well, but it was intermittent. Now it can be reproduced every single time (thus pointing towards a driver issue).

To Reproduce

  • Open the KRunner window
  • Close the KRunner window
  • Try to open the KRunner window again; it fails to display

Bug Incidence

Always

nvidia-bug-report.log.gz

nvidia-bug-report.log.gz

More Info

Relevant package versions:

egl-wayland 4:1.1.17-1
lib32-nvidia-utils 570.86.16-1
libva-nvidia-driver 0.0.13-1
linux-firmware 20250109.7673dffd-1
linux-zen 6.13.2.zen1-1
nvidia-open-dkms 570.86.16-2

FineWolf, Feb 14 '25 16:02

I'd like to chime in on the discussion and add to @FineWolf's report.

I have the same issue. And I can confirm that it also occurs with the proprietary driver package. Same version, too:

nvidia 570.86.16.

I do run a slightly different GPU: GPU 0: NVIDIA GeForce RTX 4080 (UUID: GPU-763bc430-a45a-35b7-bdee-6df66003984f)

I'm also going to attach my nvidia-bug-report for good measure. nvidia-bug-report.log.gz

MemoriesIn8bit, Feb 21 '25 09:02

I can confirm this. Here is a stacktrace of krunner triggering the bug. One would only need to activate the window twice.

Using the proprietary driver.

Thread 14 "QSGRenderThread" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffb67ff6c0 (LWP 337947)]
0x00007fffed249315 in ?? () from /usr/lib/libGLX_nvidia.so.0
(gdb) bt
#0  0x00007fffed249315 in ??? () at /usr/lib/libGLX_nvidia.so.0
#1  0x00007fffe66b9493 in ??? () at /usr/lib/libnvidia-glcore.so.570.86.16
#2  0x00007fffed26a33e in ??? () at /usr/lib/libGLX_nvidia.so.0
#3  0x00007fffed238500 in ??? () at /usr/lib/libGLX_nvidia.so.0
#4  0x00007fffed993c76 in QGLXContext::swapBuffers (this=0x7fffac002590, surface=0x555555995cb0)
    at /usr/src/debug/qt6-base/qtbase/src/plugins/platforms/xcb/gl_integrations/xcb_glx/qglxintegration.cpp:548
#5  0x00007ffff6314cbd in QRhiGles2::endFrame (this=0x7fffac001920, swapChain=0x7fffac3d94f0, flags=...) at /usr/src/debug/qt6-base/qtbase/src/gui/rhi/qrhigles2.cpp:2166
#6  0x00007ffff61ab3ce in QRhi::endFrame (this=0x7fffac001900, swapChain=0x7fffac3d94f0, flags=..., flags@entry=...) at /usr/src/debug/qt6-base/qtbase/src/gui/rhi/qrhi.cpp:10870
#7  0x00007ffff7a45a5a in QSGRenderThread::syncAndRender (this=<optimized out>) at /usr/include/qt6/QtCore/qflags.h:73
#8  QSGRenderThread::run (this=0x555555e6f490) at /usr/src/debug/qt6-declarative/qtdeclarative/src/quick/scenegraph/qsgthreadedrenderloop.cpp:975
#9  0x00007ffff58d8a9b in operator() (__closure=<optimized out>) at /usr/src/debug/qt6-base/qtbase/src/corelib/thread/qthread_unix.cpp:375
#10 (anonymous namespace)::terminate_on_exception<QThreadPrivate::start(void*)::<lambda()> > (t=<optimized out>) at /usr/src/debug/qt6-base/qtbase/src/corelib/thread/qthread_unix.cpp:311
#11 QThreadPrivate::start (arg=0x555555e6f490) at /usr/src/debug/qt6-base/qtbase/src/corelib/thread/qthread_unix.cpp:339
#12 0x00007ffff50a370a in start_thread (arg=<optimized out>) at pthread_create.c:448
#13 0x00007ffff5127aac in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
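
A rough sketch for triaging backtraces like the one above: it lists the frame numbers whose text mentions an NVIDIA library, which makes it easy to see that the crash sits inside the driver's userspace libraries below the Qt swapBuffers call. The function name `driver_frames` and the `"nvidia"` substring match are assumptions made for this example, not part of any tool.

```python
import re

# A gdb frame line starts with "#<number>" followed by the frame text.
FRAME_RE = re.compile(r"^#(\d+)\s+(.*)$")


def driver_frames(backtrace: str, marker: str = "nvidia") -> list[int]:
    """Return the frame numbers whose line mentions `marker` (case-insensitive)."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and marker in match.group(2).lower():
            frames.append(int(match.group(1)))
    return frames


# First five frames copied from the backtrace above.
sample = """\
#0  0x00007fffed249315 in ??? () at /usr/lib/libGLX_nvidia.so.0
#1  0x00007fffe66b9493 in ??? () at /usr/lib/libnvidia-glcore.so.570.86.16
#2  0x00007fffed26a33e in ??? () at /usr/lib/libGLX_nvidia.so.0
#3  0x00007fffed238500 in ??? () at /usr/lib/libGLX_nvidia.so.0
#4  0x00007fffed993c76 in QGLXContext::swapBuffers (this=0x7fffac002590, surface=0x555555995cb0)
"""

if __name__ == "__main__":
    print(driver_frames(sample))
```

Continuation lines (such as the `at .../qglxintegration.cpp:548` line) do not start with `#` and are skipped, so a frame is only counted when the library path appears on the frame line itself.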

nvidia-bug-report.log.gz

The previous version 565 is also affected.

marscher, Feb 26 '25 11:02

I confirm that driver 570.124.04 is affected.

GPU: RTX 4070
Kernel: 6.13.5-arch1-1

nvidia-bug-report.log.gz

arroyo-pl, Mar 04 '25 11:03