eframe X11 crash after using thread spawn

Open griffi-gh opened this issue 2 years ago • 28 comments

Describe the bug Random crash 1 out of 4 times when spawning a thread with thread::spawn on X11

[xcb] Unknown sequence number while processing queue
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
rustyboi: ../../src/xcb_io.c:269: poll_for_event: Assertion `!xcb_xlib_threads_sequence_lost' failed.

To Reproduce Steps to reproduce the behavior:

  • Create a simple eframe app
  • Build it, it should start just fine
  • Add thread::spawn with loop{}
  • Build it again, it should start randomly crashing on start
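
A minimal sketch of the thread-spawning step above, using only the standard library. The names `spawn_worker`, `stop` and `ticks` are hypothetical and not from the original report, which used a bare `loop {}`; the stop flag is added here only so the example can terminate. The eframe call is left as a comment because its exact signature depends on the eframe version.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Spawns a background thread that spins until `stop` is set.
/// It always performs at least one iteration before checking the flag.
fn spawn_worker(stop: Arc<AtomicBool>, ticks: Arc<AtomicU64>) -> thread::JoinHandle<()> {
    thread::spawn(move || loop {
        ticks.fetch_add(1, Ordering::Relaxed);
        if stop.load(Ordering::Relaxed) {
            break;
        }
        thread::yield_now();
    })
}

fn main() {
    let stop = Arc::new(AtomicBool::new(false));
    let ticks = Arc::new(AtomicU64::new(0));
    let worker = spawn_worker(stop.clone(), ticks.clone());

    // In the reported setup, the eframe app would be started here
    // (for example via `eframe::run_native(...)`) while the worker keeps running.
    thread::sleep(Duration::from_millis(50));

    stop.store(true, Ordering::Relaxed);
    worker.join().unwrap();
    println!("worker iterated {} times", ticks.load(Ordering::Relaxed));
}
```

Note that the worker never touches X11 itself; according to the report, merely having a second thread running was enough to make the crash show up intermittently.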

Expected behavior N/A

Screenshots image

Desktop (please complete the following information):

  • OS: Pop OS + KDE Plasma X11 (gdm3)
  • Browser: N/A

griffi-gh avatar Mar 12 '22 12:03 griffi-gh

Same here

image I am only using the code in the eframe example

OS: Arco Linux X11

Vaimer9 avatar Mar 13 '22 13:03 Vaimer9

I think this is a problem that should be reported to winit

emilk avatar Mar 13 '22 20:03 emilk

I have the same on Arch Linux x64 (Manjaro), X11, Intel HD 3000, mesa 21.3.7-2. This error has been around for at least 8 months.

uniconductive avatar Mar 17 '22 05:03 uniconductive

Anyone wants to dig into this? Likely culprits include glutin, glow, winit, or a combination of the three.

emilk avatar Mar 19 '22 12:03 emilk

I have this in Cargo.toml: eframe = { version = "0.17.0", default-features = false, features = ["default_fonts", "egui_glow"] } I see no way to set glutin...

uniconductive avatar Mar 19 '22 15:03 uniconductive

Same issue. I suspect there's a problem somewhere inside glutin. I tried (Ubuntu, latest Mesa driver from the oibaf repo) both fltk and sdl with glow, and even winit with the egui wgpu backend; everything works fine. See: https://github.com/rust-windowing/glutin/issues/1034

ar37-rs avatar Mar 19 '22 15:03 ar37-rs

Just as a data point I just upgraded to a much faster machine (from circa 2009 4 core i7 to 16 core ryzen), and my egui app does this every time. I have yet to see it run :-( On the slower machine, it never happened.

ambihelical avatar Apr 05 '22 04:04 ambihelical

Seems like a lot of people have a problem with this, yet nobody has tried to reproduce it using pure glutin and/or glow without eframe?

Try https://github.com/grovesNL/glow/tree/main/examples/hello with the glutin backend and spawn a thread and see if you can reproduce. Maybe try RUST_BACKTRACE=1 and see if you get a callstack.

emilk avatar Apr 05 '22 06:04 emilk

I can't test https://github.com/grovesNL/glow/tree/main/examples/hello : thread 'main' panicked at '0:1(10): error: GLSL 4.10 is not supported. Supported versions are: 1.10, 1.20, 1.30, 1.40, 1.50, 3.30, 1.00 ES, and 3.00 ES.

But I have the same error on https://github.com/emilk/egui/blob/0.17.0/egui_glow/examples/pure_glow.rs and https://github.com/emilk/egui/blob/0.17.0/egui_glium/examples/pure_glium.rs

uniconductive avatar Apr 05 '22 07:04 uniconductive

I tried glow hello w/glutin backend. The same error occurs. There is no backtrace, just core dump.

I was able to get something of a backtrace from the core dump:

Thread 1 (Thread 0x7f1fd4ff2800 (LWP 24822)):
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=139774694205440) at pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=139774694205440) at pthread_kill.c:80
#2 __GI___pthread_kill (threadid=139774694205440, signo=signo@entry=6) at pthread_kill.c:91
#3 0x00007f1fd5037476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007f1fd501d7b7 in __GI_abort () at abort.c:79
#5 0x00007f1fd501d6db in __assert_fail_base (fmt=0x7f1fd51d1770 "%s%s%s:%u: %s%sAssertion %s' failed.\n%n", assertion=0x7f1fd4f549e8 "!xcb_xlib_threads_sequence_lost", file=0x7f1fd4f54620 "../../src/xcb_io.c", line=269, function=) at assert.c:92
#6 0x00007f1fd502ee26 in __GI___assert_fail (assertion=0x7f1fd4f549e8 "!xcb_xlib_threads_sequence_lost", file=0x7f1fd4f54620 "../../src/xcb_io.c", line=269, function=0x7f1fd4f54fd8 "poll_for_event") at assert.c:101
#7 0x00007f1fd4ee1c6b in ?? () from /lib/x86_64-linux-gnu/libX11.so.6
#8 0x00007f1fd4ee1d0e in ?? () from /lib/x86_64-linux-gnu/libX11.so.6
#9 0x00007f1fd4ee4de2 in _XEventsQueued () from /lib/x86_64-linux-gnu/libX11.so.6
#10 0x00007f1fd4ee50cd in _XGetRequest () from /lib/x86_64-linux-gnu/libX11.so.6
#11 0x00007f1fd4ebec62 in XCreateGC () from /lib/x86_64-linux-gnu/libX11.so.6
#12 0x00007f1fd46c8be8 in ?? () from /lib/x86_64-linux-gnu/libGLX_mesa.so.0
#13 0x00007f1fd46d493f in ?? () from /lib/x86_64-linux-gnu/libGLX_mesa.so.0
#14 0x00007f1fd46d4ad0 in ?? () from /lib/x86_64-linux-gnu/libGLX_mesa.so.0
#15 0x00007f1fd46ce213 in ?? () from /lib/x86_64-linux-gnu/libGLX_mesa.so.0
#16 0x00007f1fd4724473 in ?? () from /lib/x86_64-linux-gnu/libGLX.so.0
#17 0x00007f1fd4726df9 in ?? () from /lib/x86_64-linux-gnu/libGLX.so.0
#18 0x00007f1fd4729dfd in ?? () from /lib/x86_64-linux-gnu/libGLX.so.0
#19 0x000055a9ff75cc73 in glutin_glx_sys::glx::Glx::MakeCurrent (self=0x55a9ffbe6320 <<glutin::api::glx::GLX as core::ops::deref::Deref>::deref::__stability::LAZY>, dpy=0x55aa0156d860, drawable=35651588, ctx=0x55aa015c2de0) at /home/eric/extern/glow/target/debug/build/glutin_glx_sys-d28ff4d8dc23a940/out/glx_bindings.rs:545
#20 0x000055a9ff75c3a3 in glutin::api::glx::make_current_guard::MakeCurrentGuard::new (xconn=0x7ffca70c26c8, drawable=35651588, context=0x55aa015c2de0) at /home/eric/.local/opt/cargo/registry/src/github.com-1ecc6299db9ec823/glutin-0.24.1/src/api/glx/make_current_guard.rs:42
#21 0x000055a9ff7692e5 in glutin::api::glx::ContextPrototype::finish (self=..., window=35651588) at /home/eric/.local/opt/cargo/registry/src/github.com-1ecc6299db9ec823/glutin-0.24.1/src/api/glx/mod.rs:418
#22 0x000055a9ff708d54 in glutin::platform_impl::platform_impl::x11::Context::new_impl<()> (wb=..., el=0x55aa015afab0, pf_reqs=0x7ffca70c4030, gl_attr=0x7ffca70c3c88, fallback=false) at /home/eric/.local/opt/cargo/registry/src/github.com-1ecc6299db9ec823/glutin-0.24.1/src/platform_impl/unix/x11.rs:514
#23 0x000055a9ff70842d in glutin::platform_impl::platform_impl::x11::{impl#6}::new::{closure#0}<()> (fallback=false) at /home/eric/.local/opt/cargo/registry/src/github.com-1ecc6299db9ec823/glutin-0.24.1/src/platform_impl/unix/x11.rs:456
#24 0x000055a9ff707028 in glutin::platform_impl::platform_impl::x11::Context::try_then_fallback<glutin::platform_impl::platform_impl::x11::{impl#6}::new::{closure#0}, (winit::window::Window, glutin::platform_impl::platform_impl::x11::Context)> (f=<error reading variable: Cannot access memory at address 0x6>) at /home/eric/.local/opt/cargo/registry/src/github.com-1ecc6299db9ec823/glutin-0.24.1/src/platform_impl/unix/x11.rs:168
#25 0x000055a9ff70837c in glutin::platform_impl::platform_impl::x11::Context::new<()> (wb=..., el=0x55aa015afab0, pf_reqs=0x7ffca70c4030, gl_attr=0x7ffca70c3c88) at /home/eric/.local/opt/cargo/registry/src/github.com-1ecc6299db9ec823/glutin-0.24.1/src/platform_impl/unix/x11.rs:455
#26 0x000055a9ff6e4fdc in glutin::platform_impl::platform_impl::Context::new_windowed<()> (wb=..., el=0x55aa015afab0, pf_reqs=0x7ffca70c4030, gl_attr=0x7ffca70c4078) at /home/eric/.local/opt/cargo/registry/src/github.com-1ecc6299db9ec823/glutin-0.24.1/src/platform_impl/unix/mod.rs:115
#27 0x000055a9ff73cc46 in glutin::ContextBuilderglutin::context::NotCurrent::build_windowed<glutin::context::NotCurrent, ()> (self=..., wb=..., el=0x55aa015afab0) at /home/eric/.local/opt/cargo/registry/src/github.com-1ecc6299db9ec823/glutin-0.24.1/src/windowed.rs:362
#28 0x000055a9ff6fdab4 in hello::main () at examples/hello/src/main.rs:34

ambihelical avatar Apr 06 '22 06:04 ambihelical

I should note I didn't modify hello at all, just built debug, and ran it. It ran as expected once out of about 20 runs.

I am using i3. It runs more often if I run it in a subdivided screen so the initial window is smaller. Maybe 1/3 of the time it works this way.

ambihelical avatar Apr 07 '22 04:04 ambihelical

I am using a 3rd generation pentium, just putting it there for context

Vaimer9 avatar Apr 07 '22 07:04 Vaimer9

@ambihelical please open an issue in the glutin repository!

emilk avatar Apr 07 '22 11:04 emilk

@emilk Hmmm, it turned out my new system did not have correctly set up nvidia drivers and was using software rendering. After I fixed this, I no longer get the crash. Possibly I was seeing a different bug than everyone else? I will continue to monitor things; if it occurs again, I'll do that.

ambihelical avatar Apr 07 '22 15:04 ambihelical

@emilk the root cause of this problem is that glutin can't tolerate a GPU without vsync support (I have no idea why; is it intentional or just a bug?). The simple solution is to ignore swap_interval when the GPU has no swap interval extension (vsync) support, as proposed in this PR: https://github.com/rust-windowing/glutin/pull/1387, or to simply remove this (needless error message) line inside glutin. But nobody seems to care and it is not being merged. Here's the proof that this kind of bug can be fixed and this issue can be closed: fixes
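
The idea in the comment above (skip the swap interval when no swap-control extension is available, instead of failing) could be sketched roughly like this. This is an illustration only and not glutin's actual code or the content of the PR: `SwapIntervalAction` and `plan_swap_interval` are hypothetical names, and the extension string is assumed to be the usual space-separated list reported by GLX.

```rust
/// What to do about the swap interval for a freshly created context.
#[derive(Debug, PartialEq)]
enum SwapIntervalAction {
    /// A swap-control extension exists: request this interval (1 = vsync on, 0 = off).
    Set(i32),
    /// No swap-control extension: keep the driver default instead of returning an error.
    Skip,
}

/// GLX extensions that provide swap-interval control.
const SWAP_CONTROL_EXTENSIONS: [&str; 3] = [
    "GLX_EXT_swap_control",
    "GLX_SGI_swap_control",
    "GLX_MESA_swap_control",
];

/// Decides whether a swap interval should be set, given the space-separated
/// extension string and the requested vsync setting.
fn plan_swap_interval(extensions: &str, vsync: bool) -> SwapIntervalAction {
    let supported = extensions
        .split_whitespace()
        .any(|ext| SWAP_CONTROL_EXTENSIONS.contains(&ext));
    if supported {
        SwapIntervalAction::Set(if vsync { 1 } else { 0 })
    } else {
        SwapIntervalAction::Skip
    }
}

fn main() {
    // A driver that exposes swap control: the requested vsync setting is applied.
    println!("{:?}", plan_swap_interval("GLX_EXT_swap_control", true));
    // A driver without any swap-control extension: the request is ignored.
    println!("{:?}", plan_swap_interval("GLX_ARB_create_context", true));
}
```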

ar37-rs avatar Apr 13 '22 10:04 ar37-rs

The PR https://github.com/rust-windowing/glutin/pull/1387 was finally merged a while ago; waiting for a glutin > v0.28.0 release.

ar37-rs avatar Apr 13 '22 11:04 ar37-rs

I finally installed an Ubuntu VM and ran into this exact issue ^^ probably because of using a software renderer like @ambihelical suspected.

emilk avatar Apr 13 '22 14:04 emilk

for glutin > v0.28.0 or patched glutin, turn off vsync.

// When compiling natively:
fn main() {
    // Log to stdout (if you run with `RUST_LOG=debug`).
    // tracing_subscriber::fmt::init();

    let options = eframe::NativeOptions {
        // Let's show off that we support transparent windows
        transparent: true,
        drag_and_drop_support: true,
        // Turn off vsync, like so
        vsync: false,
        ..Default::default()
    };

    eframe::run_native(
        "egui demo app",
        options,
        Box::new(|cc| Box::new(egui_demo_lib::WrapApp::new(cc))),
    );
}

the error will be gone.

ar37-rs avatar Apr 13 '22 14:04 ar37-rs

@emilk exactly, even in software render egui runs quite fast.

ar37-rs avatar Apr 13 '22 14:04 ar37-rs

I just turned off vsync as a test, but this doubles the CPU load for me, so it's not really an option.

sourcebox avatar Apr 17 '22 12:04 sourcebox

@sourcebox of course, the main purpose of vsync is... to stabilize the frame rate, and it also reduces CPU consumption on hardware with GPU acceleration, but when it comes to software rendering that's a different story.

ar37-rs avatar Apr 18 '22 11:04 ar37-rs

@Ar37-rs I do some frame limiting on my own now, but even with vsync disabled I get the error on my old notebook, which only supports OpenGL ES2. But I have some multithreading going on, so this might be the cause. On my newer notebook, which has basically the same software configuration, everything is running fine even with vsync enabled.
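
For context, manual frame limiting of the kind mentioned above can be sketched with the standard library alone. This is an assumption about the general technique, not the commenter's actual code; `FrameLimiter` and its methods are hypothetical names, and `target_fps` must be non-zero.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Caps a loop at roughly `target_fps` by sleeping away the rest of each frame.
struct FrameLimiter {
    frame_time: Duration,
    last_frame: Instant,
}

impl FrameLimiter {
    /// `target_fps` must be greater than zero.
    fn new(target_fps: u32) -> Self {
        Self {
            frame_time: Duration::from_secs(1) / target_fps,
            last_frame: Instant::now(),
        }
    }

    /// Call once per frame after the frame's work; sleeps until the frame budget is used up.
    fn wait(&mut self) {
        let elapsed = self.last_frame.elapsed();
        if elapsed < self.frame_time {
            thread::sleep(self.frame_time - elapsed);
        }
        self.last_frame = Instant::now();
    }
}

fn main() {
    let mut limiter = FrameLimiter::new(60);
    let start = Instant::now();
    for _frame in 0..6 {
        // Rendering/update work would go here.
        limiter.wait();
    }
    println!("6 frames took {:?}", start.elapsed());
}
```

Sleeping like this keeps CPU load down when vsync is off, at the cost of less precise frame pacing than a real vertical sync.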

sourcebox avatar Apr 18 '22 13:04 sourcebox

@sourcebox I'm using the Mesa OpenGL 4.5 llvmpipe software renderer > 20.x.x on an Ubuntu VM (Intel Skylake). Even if I use the latest patched glutin from the rust-windowing GitHub repo (not the current release of glutin, which is 0.28.0), my egui app suddenly crashes with vsync enabled, so in my case I have to disable vsync in order for it to work.

ar37-rs avatar Apr 18 '22 15:04 ar37-rs

Maybe your GPU doesn't support the latest OpenGL version? I just tried looking into this bug by playing with the hello example, and I got the same issue with that example. However, when I added this line below line 35, and ran it using cargo run --features=glutin:

.with_gl(glutin::GlRequest::Specific(glutin::Api::OpenGl, (4, 1)))

I didn't have this issue any more.

I also experienced a similar issue with another app/game, but I was able to fix that issue for myself by telling it what OpenGL version to use when creating the OpenGL context.

Talon1024 avatar May 10 '22 11:05 Talon1024

Are any of your problems solved with https://github.com/emilk/egui/pull/1693 ?

emilk avatar May 28 '22 16:05 emilk

Unfortunately not.

Talon1024 avatar May 29 '22 13:05 Talon1024

i have the same error, using linux mint 20.3 with KDE plasma as the desktop, eframe & egui 0.19.0

NeroNovaMoment avatar Dec 06 '22 02:12 NeroNovaMoment

This shouldn't happen with the wgpu backend (?)

griffi-gh avatar Dec 06 '22 14:12 griffi-gh