
caught XIO error: happens much more often when xdamage option is enabled

Open sergiomb2 opened this issue 3 years ago • 20 comments

Describe the bug

Ten months ago, starting with Fedora 32, I began testing an updated x11vnc build from this GitHub project [1]. On top of version 0.9.16, we also added some fixes from master, such as [2].

[1] https://src.fedoraproject.org/rpms/x11vnc/commits/main https://src.fedoraproject.org/rpms/x11vnc/blob/main/f/x11vnc.spec

[2] https://src.fedoraproject.org/rpms/x11vnc/blob/master/f/x11vnc-0.9.16-src-cursor-fix-xfc-NULL-pointer-dereference.patch

To Reproduce

1. Start x11vnc -localhost -display :0 (on the host machine, running Fedora KDE without kwayland).
2. Run vncviewer -via host_machine_ip 127.0.0.1 (on the guest machine, using TigerVNC, which has the -via option).
3. Go to the system tray and click around; after several random clicks:
4. caught XIO error
5. Do the same, but with x11vnc -localhost -display :0 -noxdamage.
6. With that option, I don't remember the last time x11vnc went down.

Expected Behavior

x11vnc -localhost -display :0, without extra options, just works.

Desktop (please complete the following information):

  • OS and version: Fedora 32 KDE
  • Xorg version used: xorg-x11-server-1.20.10
  • Wayland version used: no

Additional context

Tested with an Nvidia card. I can't find a way to debug the x11vnc crash; IIRC the gdb backtrace doesn't give us relevant information. If you know a way to work with gdb, I can provide the backtraces.

Thank you

sergiomb2, Dec 19 '20 00:12

Mhh, maybe we should disable xdamage by default?

bk138, Dec 19 '20 10:12

May be related https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815909

slavanap, May 01 '21 17:05

Hi everyone. I've just been bitten by this, and I did a bit of digging into it.

User TL;DR: CTRL+F "workaround", and let everyone know if they help

Developer TL;DR: Insert .ssh/id_ed25519.pub here: [_____________________________] if you'd like to SSH :) - attention span limit reached (ZZzzzz)

Have a full backtrace to begin with. :D

16/03/2022 22:54:58 passing arg to libvncserver: -rfbport
16/03/2022 22:54:58 passing arg to libvncserver: 5910
16/03/2022 22:54:58 x11vnc version: 0.9.16 lastmod: 2019-01-05  pid: 135534
16/03/2022 22:54:58 Using X display :0
16/03/2022 22:54:58 rootwin: 0x110 reswin: 0x5c00001 dpy: 0x996d9ba0
...
...
The VNC desktop is:      debian:10
PORT=5910
16/03/2022 22:55:04 Got connection from client 172.29.202.87
16/03/2022 22:55:04   0 other clients
16/03/2022 22:55:04 Normal socket connection
16/03/2022 22:55:04 incr accepted_client=1 for 172.29.202.87:40886  sock=10
16/03/2022 22:55:04 Client Protocol Version 3.8
16/03/2022 22:55:04 Protocol version sent 3.8, using 3.8
16/03/2022 22:55:04 rfbProcessClientSecurityType: executing handler for type 1
16/03/2022 22:55:04 rfbProcessClientSecurityType: returning securityResult for client rfb version >= 3.8
16/03/2022 22:55:04 copy_tiles: allocating first_line at size 44
16/03/2022 22:55:04 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0xFFFFFEC6)
16/03/2022 22:55:04 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0x574D5664)
16/03/2022 22:55:04 Enabling full-color cursor updates for client 172.29.202.87
16/03/2022 22:55:04 Enabling X-style cursor updates for client 172.29.202.87
16/03/2022 22:55:04 Enabling NewFBSize protocol extension for client 172.29.202.87
16/03/2022 22:55:04 Enabling ExtDesktopSize protocol extension for client 172.29.202.87
16/03/2022 22:55:04 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0xFFFFFECD)
16/03/2022 22:55:04 Enabling LastRect protocol extension for client 172.29.202.87
16/03/2022 22:55:04 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0xC0A1E5CE)
16/03/2022 22:55:04 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0xFFFFFEC7)
16/03/2022 22:55:04 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0xFFFFFEC8)
16/03/2022 22:55:04 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0xFFFFFEFE)
16/03/2022 22:55:04 Using compression level 2 for client 172.29.202.87
16/03/2022 22:55:04 Using image quality level 8 for client 172.29.202.87
16/03/2022 22:55:04 Using JPEG subsampling 0, Q92 for client 172.29.202.87
16/03/2022 22:55:04 Using tight encoding for client 172.29.202.87
16/03/2022 22:55:04 Sending rfbEncodingExtDesktopSize for size (1366x768) 
16/03/2022 22:55:04 client_set_net: 172.29.202.87  0.0038
16/03/2022 22:55:04 created   xdamage object: 0x5c00002
16/03/2022 22:55:05 client 1 network rate 993.2 KB/sec (42134.2 eff KB/sec)
16/03/2022 22:55:05 client 1 latency:  3.2 ms
16/03/2022 22:55:05 dt1: 0.0113, dt2: 0.0178 dt3: 0.0032 bytes: 27255
16/03/2022 22:55:05 link_rate: LR_LAN - 3 ms, 993 KB/s
caught XIO error:

   *** Welcome to the x11vnc crash shell! ***

PROGRAM: ./src/x11vnc  PID: 135534

POSSIBLE DEBUGGER COMMAND:

  gdb ./src/x11vnc 135534

Press "q" to quit.
Press "h" or "?" for this help.
Press "s" to try to run some commands to show a stack trace (gdb/pstack).

Anything else is passed to -Q query function.


crash> s

running:
        echo where > /tmp/gdb.135534; env PATH=$PATH:/usr/local/bin:/usr/sfw/bin:/usr/bin gdb -x /tmp/gdb.135534 -batch -n ./src/x11vnc 135534; rm -f /tmp/gdb.135534

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f9a20153866 in __GI___select (nfds=nfds@entry=0, readfds=readfds@entry=0x0, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=0x55a698533130 <_mysleep>) at ../sysdeps/unix/sysv/linux/select.c:41
41      ../sysdeps/unix/sysv/linux/select.c: No such file or directory.
#0  0x00007f9a20153866 in __GI___select (nfds=nfds@entry=0, readfds=readfds@entry=0x0, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=0x55a698533130 <_mysleep>) at ../sysdeps/unix/sysv/linux/select.c:41
#1  0x000055a69818bd90 in crash_shell () at cleanup.c:512
#2  0x000055a69818cbc3 in interrupted (sig=sig@entry=-1) at cleanup.c:570
#3  0x000055a69818d3c9 in XIOerr (d=0x55a6996d9ba0) at cleanup.c:395
#4  0x00007f9a20c92a75 in _XIOError (dpy=0x55a6996d9ba0) at XlibInt.c:1548
#5  0x00007f9a20c8f64e in _XReply (dpy=0x55a6996d9ba0, rep=0x7fff23c93ff0, extra=0, discard=0) at xcb_io.c:797
#6  0x00007f9a20c6c8cd in XGetImage (dpy=0x55a6996d9ba0, d=272, x=0, y=111, width=1366, height=1, plane_mask=18446744073709551615, format=2) at GetImage.c:85
#7  0x00007f9a20c6cbd1 in XGetSubImage (dpy=0xfffffffffffffdfe, d=272, x=0, y=111, width=1366, height=1, plane_mask=18446744073709551615, format=2, dest_image=0x55a6997a6f60, dest_x=0, dest_y=0) at GetImage.c:147
#8  0x000055a69821d4a8 in XGetSubImage_wr (disp=<optimized out>, d=<optimized out>, x=<optimized out>, y=<optimized out>, width=<optimized out>, height=height@entry=1, plane_mask=<optimized out>, format=<optimized out>, dest_image=<optimized out>, dest_x=<optimized out>, dest_y=<optimized out>) at xwrappers.c:335
#9  0x000055a69821e80a in copy_image (dest=0x55a6997a6f60, x=<optimized out>, y=<optimized out>, w=<optimized out>, h=1) at xwrappers.c:865
#10 0x000055a6981bf01d in scan_display (ystart=<optimized out>, rescan=rescan@entry=0) at scan.c:3300
#11 0x000055a6981c3cf9 in scan_for_updates (count_only=count_only@entry=0) at scan.c:3473
#12 0x000055a6981cfbb1 in watch_loop () at screen.c:4703
#13 0x000055a698179dc1 in main (argc=<optimized out>, argv=<optimized out>) at x11vnc.c:6030
[Inferior 1 (process 135534) detached]

running:
        pstack 135534

sh: 1: pstack: not found
crash> q
quitting.
16/03/2022 22:55:17 deleted 43 tile_row polling images.

This backtrace took me a total of 10 minutes to grab (as I can reproduce this instantly), and if you think you're experiencing the same problem, I'd encourage you to grab one too. The legwork is very simple and may greatly aid in debugging:

  1. you'll be able to confirm you have exactly the same problem (I even suspect an identifier of this problem is that backtraces will consistently be 99% identical to the one above) and
  2. you may be able to help out testing Ideas™ to further nail this down - I myself crashed into this on a borrowed machine, so while I'll certainly try and join in the fun :) anyone else experiencing this will be able to make a real difference by chiming in with feedback.

I guess the zeroth step is to ensure you can build stuff - make, autoconf, gcc, git etc. If anyone gets stuck here I might be able to help.

Next, grab libX11:

$ git clone https://gitlab.freedesktop.org/xorg/lib/libx11
$ cd libx11
$ ./autogen.sh
$ ./configure CFLAGS='-g -O0'
$ make # -j4 etc

Now grab x11vnc:

$ git clone https://github.com/LibVNC/x11vnc
$ cd x11vnc

Edit src/cleanup.c and modify (currently line 67) crash_debug = 0 to 1. This flag is hardcoded in the source and cannot be reached by any commandline options.
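If you'd rather not open an editor, the flag flip can be sketched as a small shell function. This assumes the assignment in the file literally reads "crash_debug = 0" as quoted above (the exact text may differ between revisions), and enable_crash_debug is just a name I made up for illustration:

```shell
# Flip the hardcoded crash_debug flag from 0 to 1.
# ASSUMPTION: the source line contains the literal text "crash_debug = 0".
# The function name and the .orig backup suffix are illustrative only.
enable_crash_debug() {
    file="${1:-src/cleanup.c}"
    # Keep a backup so the change is easy to inspect and revert.
    cp "$file" "$file.orig" &&
    sed 's/crash_debug = 0/crash_debug = 1/' "$file.orig" > "$file"
}
```

Run it from the x11vnc checkout as enable_crash_debug (or pass a path), then diff src/cleanup.c.orig against src/cleanup.c to confirm exactly one line changed before building.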

Build it:

$ ./autogen.sh
$ ./configure CFLAGS='-g -O0' --x-libraries=../libx11/.libs --x-includes=../libx11/include
$ make # -j4 etc

I added the X library/include path arguments for completeness; I'm not sure if they're actually doing anything (lol). For Reasons™ I don't care to figure out, the linker will still (understandably) prioritize the system libX11, so just run

LD_PRELOAD=../libx11/src/.libs/libX11.so.6 ./src/x11vnc

to get the linker's attention without needing to sudo make install the newly built libX11 all over the system. Thankfully it's that simple in this case :D

You can concretely verify you picked up the locally-built library correctly:

$ LD_PRELOAD=../libx11/src/.libs/libX11.so.6 ldd ./src/x11vnc | head
        linux-vdso.so.1 (0x00007ffc457b0000)
        ../libx11/src/.libs/libX11.so.6 (0x00007fa6b98f9000)

I'd actually recommend double-checking this real quick to be sure, because LD_PRELOAD is more of a hint than a directive: if the preload can't be loaded, the loader will at most print an easy-to-miss warning and happily plow right past it, as long as it can find the libraries it wants somewhere else. ldd shows what actually gets resolved.
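To make that check less eyeball-dependent, here is a rough sketch that reads ldd output on stdin and succeeds only if libX11 resolves under the local build tree. The function name uses_local_libx11 and the default directory are my own assumptions, matching the paths used above:

```shell
# Succeed (exit 0) only if some libX11.so line in the ldd output on stdin
# mentions the expected directory. Default directory matches the LD_PRELOAD
# path used above; pass another directory as $1 if your layout differs.
uses_local_libx11() {
    want="${1:-../libx11/src/.libs}"
    awk -v want="$want" '
        # Matches libX11.so.N but not libX11-xcb.so.N.
        $1 ~ /libX11\.so/ { if (index($0, want) > 0) found = 1 }
        END { exit(found ? 0 : 1) }
    '
}
```

Usage would look like: LD_PRELOAD=../libx11/src/.libs/libX11.so.6 ldd ./src/x11vnc | uses_local_libx11 && echo "local libX11 picked up".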

OK.

...This is the part where you hopefully have a straightforward way to trigger the crash. :crying_cat_face:

For me all I need to do is mouse back and forth over the tabs in a Chrome window so the little preview popup appears, or even just scroll a webpage (with smooth scrolling enabled, to benefit the local experience) - the resulting flurry of animation seems to be a fairly sure-fire trigger for the hardware I'm working with.

Many users seem to need to wait for hours for this bug to trigger. If you find you can reproduce this very easily/quickly (rare?) you may be able to be of particular assistance with fixing this - see if there are any comments below with ideas you can try.

I don't have any significant first-order "go and try this" suggestions to offer myself. I'm instead currently looking at bodging x11vnc into cooperating (or using the TigerVNC Xvnc server module, which is just a tad hiccup-ey for my liking), and to that end, I may have identified a couple of potential novel x11vnc workarounds:

  • Using a compositing manager may alleviate the problem. The logic in XGetSubImage_wr() may use an alternate code path... https://github.com/LibVNC/x11vnc/blob/4ee62dace18c365012d4a0c168923ffec6222fda/src/xwrappers.c#L355-L364 ...under some circumstances. The alternate path may not always be used depending on various factors (that I'm not sure how to reason about), and even if it does get used, it falls through to the old broken code if it fails. I describe this as uncertainly as I do because I don't know how consistently if(use_xcomposite && subwin && !rootshift) { will be TRUE, and how reliable the XGetImage() call it uses will be (and thus how reliably it won't fall through to the current logic that likes to explode).
  • copy_image will never call XGetSubImage_wr() if -snapfb is in use: https://github.com/LibVNC/x11vnc/blob/4ee62dace18c365012d4a0c168923ffec6222fda/src/xwrappers.c#L847 Theoretically -snapfb is pretty heavyweight in terms of performance and efficiency as it copies the entire screen, but on the relatively old laptop I'm encountering this on I find x11vnc seems (?) to use similar amounts of CPU (I can't really tell since the non--snapfb code path just doesn't work for long enough lol).
  • It's possible the -onetile option may influence the copying logic such that it copies fewer tiles at a time, which may trigger the bug less frequently, and potentially even not at all.
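For anyone wanting to try these one at a time, a minimal launcher sketch follows. It only uses flags already mentioned in this thread (-noxdamage from the original report, -snapfb and -onetile from the list above); the function name and the X11VNC / X11VNC_DISPLAY override variables are hypothetical conveniences, not anything x11vnc itself reads:

```shell
# Launch x11vnc with exactly one candidate workaround flag.
# X11VNC (binary path) and X11VNC_DISPLAY are hypothetical overrides that
# exist only in this sketch; defaults mirror the commands in the report.
run_x11vnc_workaround() {
    case "$1" in
        noxdamage|snapfb|onetile) flag="-$1" ;;
        *) echo "usage: run_x11vnc_workaround noxdamage|snapfb|onetile" >&2
           return 2 ;;
    esac
    "${X11VNC:-x11vnc}" -localhost -display "${X11VNC_DISPLAY:-:0}" "$flag"
}
```

For example, X11VNC=./src/x11vnc run_x11vnc_workaround snapfb runs the locally built binary with -snapfb. Testing one flag per run keeps it clear which one (if any) made the difference.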

To give full credit, I think I noticed a forum post mentioning that -snapfb fixed their use case, but of course now I can't re-find it.


Following is a bit of commentary about some observations that may be of interest to developers and motivated users.

Unpacking the backtrace as best as I can, I guess the interesting part starts at _XReply. Why is it bailing out?

https://github.com/mirror/libX11/blob/918063298cb893bee98040c9dca45ccdb2864773/src/xcb_io.c#L794-L799:

    794         /* it's not an error, but we don't have a reply, so it's an I/O
    795          * error. */
    796         if(!reply) {
    797                 _XIOError(dpy);
    798                 return 0;
    799         }

Noo. Why don't we have a reply?

...That's a really good question, so I asked XCB:

+   678         printf("seq=%ld\n", current->sequence);
    679 
    680         /* Don't let any other thread get this reply. */
    681         current->reply_waiter = 1;
    682 
+   683         printf("{{\n");
    684 
    685         while(1)
    686         {
    ...
    697                 UnlockDisplay(dpy);
    698 
+   699                 printf("req seq=%ld\n", req->sequence);
    700 
    701                 response = xcb_wait_for_reply64(c, req->sequence, &error);
    702 
+   703                 printf("response=%p error=%p\n", response, error);
    ...
    751         }
    752 
+   753         printf("}}\n");
    754 
    755         if (!check_internal_connections(dpy))
    756                 return 0;

This was rewarded with:

...
...
seq=1435
{{
req seq=1434
response=(nil) error=(nil)
req seq=1435
response=0x56399c23cb30 error=(nil)
}}
seq=1436
{{
req seq=1435
response=(nil) error=(nil)
req seq=1436
response=0x56399c3e7d80 error=(nil)
}}
seq=1437
{{
req seq=1436
response=(nil) error=(nil)
req seq=1437
response=0x56399c071ab0 error=(nil)
}}
seq=1438
{{
req seq=1437
response=(nil) error=(nil)
req seq=1438
response=0x56399c3ea660 error=(nil)
}}
seq=1439
{{
req seq=1438
response=(nil) error=(nil)
req seq=1439
response=0x56399c3ebde0 error=(nil)
}}
seq=1440
{{
req seq=1439
response=(nil) error=(nil)
req seq=1440
response=(nil) error=(nil)
}}
caught XIO error:

   *** Welcome to the x11vnc crash shell! ***

Wat. There really are no replies when it crashes.
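If you collect a long trace like this, scanning it by eye gets old fast. A small sketch, assuming the exact printf format from my instrumentation above ("seq=N", "req seq=M", "response=PTR error=PTR"), that prints every sequence number whose own request came back empty (find_lost_replies is a made-up name):

```shell
# Print each sequence number N for which the request with the same
# sequence number ("req seq=N") returned response=(nil).
# Reads the instrumentation trace on stdin.
find_lost_replies() {
    awk '
        /^seq=/      { sub(/^seq=/, ""); cur = $0; next }
        /^req seq=/  { sub(/^req seq=/, ""); req = $0; next }
        /^response=/ { if (req == cur && $1 == "response=(nil)") print cur }
    '
}
```

Pipe the x11vnc output through it (for example ./src/x11vnc 2>&1 | find_lost_replies). A nil response for the previous sequence number, as seen in every block above, is ignored; only a nil for the current one is reported.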

Unfortunately (but perhaps predictably/understandably) this is where I'm out of the game in terms of difficulty level :) I'm extremely happy to go poking around and follow someone else's nose though :D

Thinking about the situation a bit myself, I wonder if this is a reader/writer race condition. For example, maybe Chrome is updating the screen (SHM region, presumably) at the exact same nanosecond x11vnc is trying to read it... potentially across concurrent threads... and perhaps a barrier/mutex/synchronization fence in the X server isn't locking where it should be, or something. Perhaps this is happening because XDAMAGE is feeding x11vnc "this has changed" events faster than the X server can marshal/arbitrate access to them, or maybe so many additional screen updates are occurring around recently-posted XDAMAGE update coordinates that attempts to read the screen are falling within the trigger boundaries of a particularly tight race condition of some kind that only rears its head in circumstances of very precise timing.

In any case, I mused that one fairly abrupt solution to this problem might be to patch libX11 to "peek" into the reply queue (yay... probably not the first time that's been suggested) to see if there's actually a response to read... and bailing out early if there isn't, which could be totally acceptable because the API for XGetImage() can legally return NULL to signal failure.

That might work. Or it might not fix the actual problem, which might be request/reply desync instead. I tried commenting out the _XIOError() in xcb_io.c (the one I quoted above) to see what would happen, given that the associated if() code path promptly exits out of the function with NULL (as per allowed API behavior - sounds hopeful). This had interesting and disconcerting results:

...
...
seq=6800
{{
req seq=6799
response=(nil) error=(nil)
req seq=6800
response=0x562c44869ad0 error=(nil)
}}
seq=6801
{{
req seq=6800
response=(nil) error=(nil)
req seq=6801
response=0x562c44869ad0 error=(nil)
}}
seq=6802
{{
req seq=6801
response=(nil) error=(nil)
req seq=6802
response=0x562c44869ad0 error=(nil)
}}
...
...
seq=6838
{{
req seq=6838
response=(nil) error=(nil)
}}
seq=6839
{{
req seq=6839
response=(nil) error=(nil)
}}
seq=6840
{{
req seq=6840
response=(nil) error=(nil)
}}
seq=6841
{{
req seq=6841
response=(nil) error=(nil)
}}
seq=6842
{{
req seq=6842
response=(nil) error=(nil)
}}
[xcb] Unknown sequence number while processing queue
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
x11vnc: xcb_io.c:269: poll_for_event: Assertion `!xcb_xlib_threads_sequence_lost' failed.
caught signal: 6

   *** Welcome to the x11vnc crash shell! ***

PROGRAM: ./src/x11vnc  PID: 188929

POSSIBLE DEBUGGER COMMAND:

  gdb ./src/x11vnc 188929

Press "q" to quit.
Press "h" or "?" for this help.
Press "s" to try to run some commands to show a stack trace (gdb/pstack).

Anything else is passed to -Q query function.


crash> q
quitting.
17/03/2022 03:02:30 deleted 43 tile_row polling images.

   *** Welcome to the x11vnc crash shell! ***

PROGRAM: ./src/x11vnc  PID: 188929

POSSIBLE DEBUGGER COMMAND:

  gdb ./src/x11vnc 188929

Press "q" to quit.
Press "h" or "?" for this help.
Press "s" to try to run some commands to show a stack trace (gdb/pstack).

Anything else is passed to -Q query function.


crash> q
quitting.

This demonstrates several things:

  1. The function finagles its way through several thousand "successful" (??) requests/replies, returning NULL as appropriate, for a very short period (in wall clock terms, about 4 point 2 nanoseconds ;)), and *then* crashes into an XCB sequence desync.

  2. It works and then it doesn't, which means it seems to be able to viably survive for short periods of time returning NULL and bailing. The less-"oh no" interpretation is that the couple bajillionths of a second it runs for are influenced by how long it's taking for the X server to poke some new data out over the client socket... but that doesn't explain the request sequence number iteration :'v

  3. When it exits it crashes twice (noooo), which strongly suggests to me that a mutex got out of sync somewhere... possibly, if I understand pthreads correctly? Can *checks known-incomplete notes* processes dying release mutexes in other processes? (And while I'm at it, how am I interacting with the crash shell twice, synchronously, in sequence... I thought threads didn't have this sort of implicit coordinational capability... *eyeglaze*)

  4. Maybe (naive "common-sense" incoming) the server is not able to provide a sequence reply quickly enough... within whatever window of acceptability is defined by libX11...? (*more eyeglaze*)

Despite being completely out of my depth with all this (I'm trying to train myself to tolerate the feeling of flustered disorientation :)) and right at the limit of my attention span, I did notice the reference to xtruss in https://github.com/TigerVNC/tigervnc/issues/869#issuecomment-540949831. The last time I tried playing with this program (while learning about the X11 protocol \o/) it wedged Xorg ;) so initially I resorted to xscope, which I'd had better experiences with... only to find it decided every reply from x11vnc was *INVALID* (guess it doesn't know about XDAMAGE and XSHM), so I punted and fired xtruss up via SSH.

For completeness, xtruss did turn up something tiny and curious. It doesn't happen all the time, so this could be a size-XL red herring, but sometimes, crashes happen exactly at the moment a SendEvent-generated event goes past.

I can reproduce it somewhat consistently:

05c00000: GetImage(drawable=wp#00000110, x=1344, y=352, width=22, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFF1F
1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFFFFFFF:FFFFFFFF:...}                                                                      
05c00000: GetImage(drawable=wp#00000110, x=1344, y=384, width=22, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFF1F
1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFFFFFFF:FFFFFFFF:...}                                                                      
05c00000: GetImage(drawable=wp#00000110, x=1344, y=416, width=22, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFF1F
1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFFFFFFF:FFFFFFFF:...}                                                                      
05c00000: GetImage(drawable=wp#00000110, x=1344, y=448, width=22, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFF1F
1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFFFFFFF:FFFFFFFF:...}                                                                      
05c00000: GetImage(drawable=wp#00000110, x=1344, y=480, width=22, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=46FF005B:00000110:05C0002F:0B297374:00000000:02E80556:00000000:03000
556:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:...}                                                                      
05c00000: --- SendEvent-generated DAMAGE:UnknownEvent22                                                                                                                                                                       
05c00000: GetImage(drawable=wp#00000110, x=1344, y=512, width=22, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFF1F
1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFF1F1F1:FFFFFFFF:FFFFFFFF:...}                                                                      
05c00000: --- DAMAGE:UnknownEvent0                                                                                                                                                                                            
05c00000: --- DAMAGE:UnknownEvent0                                                                                                                                                                                            
05c00000: --- DAMAGE:UnknownEvent0                                                                                                                                                                                            
05c00000: --- DAMAGE:UnknownEvent0                                                                                                                                                                                            
05c00000: --- DAMAGE:UnknownEvent0                                                                                                                                                                                            
05c00000: --- DAMAGE:UnknownEvent0                                                                                                                                                                                            
05c00000: --- DAMAGE:UnknownEvent0                                                                                                                                                                                            
05c00000: --- DAMAGE:UnknownEvent0                                                                                                                                                                                            

Again:

05c00000: ShmGetImage(drawable=wp#00000110, x=1152, y=96, width=32, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap, shmseg=0x05C00004, offset=0x00000000) = {depth=24, visual=v#00000020, size=4096}
05c00000: ShmGetImage(drawable=wp#00000110, x=1184, y=96, width=32, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap, shmseg=0x05C00004, offset=0x00000000) = {depth=24, visual=v#00000020, size=4096}
05c00000: ShmGetImage(drawable=wp#00000110, x=1216, y=96, width=32, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap, shmseg=0x05C00004, offset=0x00000000) = {depth=24, visual=v#00000020, size=4096}
05c00000: ShmGetImage(drawable=wp#00000110, x=1248, y=96, width=32, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap, shmseg=0x05C00004, offset=0x00000000) = {depth=24, visual=v#00000020, size=4096}
05c00000: ShmGetImage(drawable=wp#00000110, x=1280, y=96, width=32, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap, shmseg=0x05C00004, offset=0x00000000) = {depth=24, visual=v#00000020, size=4096}                        
05c00000: GetImage(drawable=wp#00000110, x=1312, y=96, width=54, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFF
FF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:...}                                                                       
05c00000: GetImage(drawable=wp#00000110, x=64, y=128, width=1302, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFF
FFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:...}                                                                      
05c00000: GetImage(drawable=wp#00000110, x=64, y=160, width=1302, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=2073005B:00000110:05C0002F:0B2D4C04:00000000:02E80556:00000000:03000
556:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:...}                                                                           
05c00000: --- SendEvent-generated DAMAGE:UnknownEvent22                                                                                                                                                                       
05c00000: GetImage(drawable=wp#00000110, x=64, y=192, width=1302, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFF
FFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:...}                                                                           
05c00000: --- DAMAGE:UnknownEvent0                                                                             
05c00000: --- DAMAGE:UnknownEvent0
05c00000: --- DAMAGE:UnknownEvent0                                                                             
09600000:  ... RECORD:UnknownExtensionRequest5(bytes=8) = {<unable to decode reply data>}                        
05c00000: --- DAMAGE:UnknownEvent0                                                                             
09600000:  ... RECORD:UnknownExtensionRequest5(bytes=8) = {<unable to decode reply data>}                        
05c00000: --- DAMAGE:UnknownEvent0

A random outlier:

05c00000: --- DAMAGE:UnknownEvent0                                                                                                                                                                                            
05c00000: GetImage(drawable=wp#00000110, x=64, y=608, width=1302, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFF
FFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:...}                                                                      
05c00000: GetImage(drawable=wp#00000110, x=64, y=640, width=1302, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFF
FFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:...}                                                                      
05c00000: GetImage(drawable=wp#00000110, x=64, y=672, width=1302, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFF
FFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:...}                                                                      
05c00000: GetImage(drawable=wp#00000110, x=64, y=704, width=1302, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=22C0005B:00000110:05C0002F:0B321E21:00000000:02E80556:00000000:03000
556:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:...}                                                                           
05c00000: --- XInputExtension:UnknownEvent14                                                                                                                                                                                  
05c00000: GetImage(drawable=wp#00000110, x=64, y=736, width=1302, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFF
FFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:...}                                                                           
05c00000: --- DAMAGE:UnknownEvent0
05c00000: --- DAMAGE:UnknownEvent0
05c00000: --- DAMAGE:UnknownEvent0                                                                             
09600000:  ... RECORD:UnknownExtensionRequest5(bytes=8) = {<unable to decode reply data>}                        
05c00000: --- DAMAGE:UnknownEvent0                                                                             

Yet again:

05c00000: GetImage(drawable=wp#00000110, x=896, y=544, width=470, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFF
FFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:...}                                                                      
05c00000: GetImage(drawable=wp#00000110, x=896, y=576, width=470, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=4EA1005B:00000110:05C0002F:0B33C0B1:00000000:02E80556:00000000:03000
556:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:...}                                                                           
05c00000: --- SendEvent-generated DAMAGE:UnknownEvent22                                                                                                                                                                       
05c00000: GetImage(drawable=wp#00000110, x=896, y=608, width=470, height=32, plane-mask=0xFFFFFFFF, format=ZPixmap) = {depth=24, visual=v#00000020, image-data=FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFF
FFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF:...}                                                                      
05c00000: --- DAMAGE:UnknownEvent0                                                                             
05c00000: --- DAMAGE:UnknownEvent0                                                                                                                                                                                            
05c00000: --- DAMAGE:UnknownEvent0                                                                                                                                                                                            
05c00000: --- DAMAGE:UnknownEvent0                                                                             
05c00000: --- DAMAGE:UnknownEvent0                                                                                                                                                                                            

A cursory look around finds that x11vnc does very little with XSendEvent()... suggesting that those aren't really SendEvent-generated <whatever they say they are>s, but rather that xtruss is demonstrating Very Bad™ protocol desync. I'm not sure how to verify this hypothesis though.

To close, I've discovered that this issue seems to reproduce under the rr debugger, even though rr interferes with SHM and -noshm must be passed to x11vnc for it to start. It actually launches and crashes on command :) and the replay works and everything. Sadly, the caveat with rr is that recordings don't seem to be portable between machines (which makes sense but is a shame). rr is also better suited to shorter repro cycles, since longer sessions store a lot of data and take time to replay from scratch on every relaunch. I wonder if it could be interesting for affected users to provide SSH access to a locked-down account that can access the recordings?
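A rough sketch of that rr workflow, for anyone who wants to try it. The function name record_x11vnc and the RR override variable are made up for illustration. The real pieces are `rr record`, `rr replay`, and x11vnc's -noshm and -display options. Display :0 is an assumption.

```shell
# Sketch only: record x11vnc under rr so the crash can be replayed later.
# RR is a hypothetical override (it defaults to the real `rr` binary).
record_x11vnc() {
    # -noshm is needed because rr interferes with shared memory (SHM).
    "${RR:-rr}" record x11vnc -noshm -display "${1:-:0}"
}

# Usage (run by hand):
#   record_x11vnc :0    # connect a viewer and reproduce the crash
#   rr replay           # replay the most recent recording under gdb
```

Keeping the recorded session short should keep the trace small and the replay quick.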

The chronology (that I can find) of this bug is curious. It seems to have been around for a while, yet it seems to have gotten especially intolerable recently. Here are all the threads I was able to find on the subject:

  • https://askubuntu.com/questions/288647/vnc-using-lightdm-and-x11vnc-disconnects-immediately-after-valid-login (Apr 2013)
  • https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815909 (Feb 2016)
  • https://forums.gentoo.org/viewtopic-t-1066066-start-0.html (Jul 2017)
  • https://github.com/TigerVNC/tigervnc/issues/869 (describing what appears to be the exact same failure mode in TigerVNC's x0vncserver) (interesting) (Sep 2019)
  • https://archived.forum.manjaro.org/t/x11vnc-vncviewer-close-connection-or-show-rfb-98/115423 (Dec 2019)
  • https://forums.freebsd.org/threads/x11vnc-does-not-work-as-it-should-after-upgrade-from-11-to-12-1-version-crashes-every-time-after-2-3-days.73400/ (someone experiencing the issue on FreeBSD - so X and x11vnc are running in a fairly different environment) (Dec 2019)
  • https://stackoverflow.com/questions/64751230/x11vnc-connection-dropping-unexpectedly (Nov 2020)
  • Issue #147 here (Nov 2020)
  • This issue (Dec 2020, a month later)
  • https://github.com/LibVNC/x11vnc/issues/113 (Feb 2021)
  • https://bugs.archlinux.org/task/71685 (Aug 2021)

I am very curious what the bug turns out to be.

exikyut avatar Mar 16 '22 18:03 exikyut

@exikyut can you do a PR or a patch with the workaround?

sergiomb2 avatar Mar 16 '22 20:03 sergiomb2

@sergiomb2 That's a great thought and I can see where you're coming from, but I don't know enough about how x11vnc works internally to blanket-recommend novel behavioral alterations to the source code that go beyond the current commandline options.

I also wanted to throw out breadcrumbs that would be more toward the accessible end (ie not requiring users to rebuild x11vnc from source) to raise the likelihood people would try them out and report how they went.

If I was going to double-down on editing the source code, I'd probably focus on trying to make copy_image() always use XShmGetImage() by seeing if w == dest->width && h == dest->height can be made to always be true, eg by creating a temporary XImage that's always the correct size.

https://github.com/LibVNC/x11vnc/blob/4ee62dace18c365012d4a0c168923ffec6222fda/src/xwrappers.c#L860-L864

I'm not sure how relevant or how performant that would be.


Also, I repeatedly forgot to squish this in somewhere before hitting Comment - I'm experiencing this on a Gateway NE56R laptop containing a Pentium B960 CPU with Intel HD Graphics 2000. Fairly oldish hardware (2012), and relatively low- to mid-range on the performance spectrum (2 cores, no HT) - but I actually wonder if this may contribute to why this problem triggers so readily on this machine.

exikyut avatar Mar 17 '22 00:03 exikyut

Two things:

  1. I had a go at bisecting x11vnc, and came up with https://github.com/LibVNC/x11vnc/compare/3a12472..f72a0d2. This is sadly just a configuration bump where the XDAMAGE #define was being renamed and the code and build config weren't using the same name, so XDAMAGE support was compiled out. Given that this was from all the way back in Dec 2010, and that I was marking commits from ~before that point as bad (because they were crashing almost immediately), I think I can be pretty sure that this is not actually a problem with x11vnc but rather something else, libx11 or maybe the X server itself.

  2. This machine doesn't really have the RAM to comfortably run a concurrent second instance of Chrome, so I closed my session to move it over to a copy of Xvfb as a test. Unfortunately, when I moved Chrome back to the system X server I found x11vnc suddenly behaving extremely reliably :/ and not crashing at all. This means that Chrome's behavior and interactions with the X server are implicated somehow, perhaps via something to do with SHM. Welp; I can't really do very much until this begins consistently reproducing again.

exikyut avatar Mar 17 '22 03:03 exikyut

I use my gitlog alias: alias gitlog='git --no-pager log --pretty=tformat:"%C(yellow)%h %C(cyan)%ad %Cblue%an%C(auto)%d %Creset%s" --graph --date=format:"%Y-%m-%d %H:%M" -25'

8e416dad782042e077c721e820a51268c5a3ea66 is the beginning of x11vnc with autotools

gitlog cbb5c4f -50

  • cbb5c4f 2014-09-09 10:57 Christian Beier Add prepare_x11vnc_dist.sh from libvncserver repo.
  • df228dc 2014-09-03 20:50 Christian Beier Add tightvnc-1.3dev5-vncviewer-alpha-cursor.patch.
  • 65bde7d 2014-09-03 20:04 Christian Beier Fix make dist.
  • b158137 2014-09-03 19:55 Christian Beier HAVE_FBPM is not a libvncserver autoconf test but specific to x11vnc.
  • 45899ca 2014-09-03 19:54 Christian Beier IRIX_XREADDISPLAY is not a libvncserver autoconf test but specific to x11vnc.
  • 910713c 2014-09-03 19:53 Christian Beier HAVE SOLARIS_XREADSCREEN is not a libvncserver autoconf test but specific to x11vnc.
  • 8761f8c 2014-09-03 19:52 Christian Beier HAVE_LIBXTRAP is not a libvncserver autoconf test but specific to x11vnc.
  • e6acc14 2014-09-03 19:51 Christian Beier HAVE_RECORD is not a libvncserver autoconf test but specific to x11vnc.
  • 3a12472 2014-09-03 19:49 Christian Beier HAVE_LIBXDAMAGE is not a libvncserver autoconf test but specific to x11vnc.
  • f72a0d2 2014-09-03 19:48 Christian Beier HAVE_LIBXFIXES is not a libvncserver autoconf test but specific to x11vnc.
  • c0ba582 2014-09-03 19:48 Christian Beier HAVE_LIBXRANDR is not a libvncserver autoconf test but specific to x11vnc.
  • 471dc57 2014-09-03 19:47 Christian Beier HAVE_LIBXINERAMA is not a libvncserver autoconf test but specific to x11vnc.
  • 2f7332e 2014-09-03 19:46 Christian Beier HAVE_XKEYBOARD is not a libvncserver autoconf test but specific to x11vnc.
  • 045265d 2014-09-03 19:43 Christian Beier HAVE_XTEST is not a libvncserver autoconf test but specific to x11vnc.
  • 737791a 2014-09-03 19:40 Christian Beier HAVE_XSHM is not a libvncserver autoconf test but specific to x11vnc.
  • b656c98 2014-09-03 19:34 Christian Beier The define is now called HAVE_WAITPID without the LIBVNCSERVER_ prefix.
  • c9be8a0 2014-09-03 19:30 Christian Beier Allow building with OpenSSL even when libvncserver was not built with OpenSSL.
  • e55aabc 2014-09-03 19:19 Christian Beier Remove last libvncserver specific autoconf tests.
  • 0a705f5 2014-09-03 17:07 Christian Beier Actually honour that X11 is available.
  • d287ed9 2014-09-03 17:01 Christian Beier Remove some of the libvnc specific autoconf tests.
  • 69f488e 2014-09-03 16:45 Christian Beier We always build with external libvnc[server|client] now. First successful build!
  • 2144763 2014-09-03 16:24 Christian Beier Do not prefix the generated configure header defines.
  • 02259ba 2014-09-03 15:44 Christian Beier Use now included font header in unixpw.c.
  • 31597ca 2014-09-03 15:17 Christian Beier Fix for unknown sockaddr_in type.
  • a17b3ce 2014-09-03 13:09 Christian Beier Add font header not shipped with libvncserver.
  • 8e416da 2014-09-03 13:01 Christian Beier Re-add autotools build system, taken from libvncserver. WIP!
  • a5a4477 2014-09-03 12:24 Christian Beier Add AUTHORS file, req'd by autotools.
  • 196cd2f 2014-09-03 12:23 Christian Beier Rename RELEASE-NOTES to NEWS, as this file is required by autotools anyway.
  • 18512dc 2014-08-16 16:20 dscho Merge pull request #16 from sandsmark/master
    • 031e5a4 2014-07-10 14:34 Will Thompson x11vnc: fix double X_UNLOCK on xrandr events
  • 0111777 2014-07-10 14:34 Will Thompson x11vnc: fix double X_UNLOCK on xrandr events
  • 8947292 2014-05-13 19:09 dextero x11vnc: adjust blackout region coordinates to the clipping region
  • 73cfc1b 2012-03-10 11:38 DRC Fix the build of x11vnc when an out-of-tree build directory is used
  • 18b2f97 2010-12-29 10:05 runge x11vnc: Use opengl to read screen on macosx. non-deprecated macosx interfaces for input injection.
  • bb7360d 2010-12-21 14:31 runge x11vnc: force --with-system-libvncserver to use correct headers.
  • 187cb9f 2010-12-21 12:04 runge x11vnc: touchscreen uinput support and Java viewer mousewheel support. See x11vnc/ChangeLog for rest.
  • 2df8ffb 2010-09-10 14:26 runge update to x11vnc 0.9.12
  • 3703b84 2010-05-08 19:53 runge x11vnc: tweaks to prepare_x11vnc_dist.sh. set cd->unixname in apply_opts().

XDAMAGE support was developed in 2005:

gitlog cbb5c4f -500 | grep -i damage

  • 3a12472 2014-09-03 19:49 Christian Beier HAVE_LIBXDAMAGE is not a libvncserver autoconf test but specific to x11vnc.
  • 0a5d494 2007-02-10 21:52 runge x11vnc: watch textchat, etc in unixpw, implement kbdReleaseAllKeys, setSingleWindow, setServerInput. watch for OpenGL apps breaking XDAMAGE.
  • 3908f75 2007-01-07 20:05 runge changes to ncache cache aging and xdamage skipping
  • 93982b9 2005-03-12 22:14 runge x11vnc: X DAMAGE support, -clip WxH+X+Y, identd.

sergiomb2 avatar Mar 17 '22 05:03 sergiomb2

Hi, I vaguely recall someone saying that they don't see the bug with debug enabled (-dbg).

On Fedora 35 I have now been running x11vnc -localhost -display :0 --repeat -dbg without any error for some weeks (i.e. with xdamage enabled). But I can't say that running without errors is exclusive to -dbg.

Looking at what the x11vnc package depends on in Fedora 35, only libX11-1.7.3 is new; libvncserver-0.9.13 has some patches but they are 2 years old, and libXdamage and the other X libs don't seem to have had anything new for years.

sergiomb2 avatar Jun 12 '22 05:06 sergiomb2

I'm seeing this "caught XIO" issue too, it's only recently started doing this. Been working fine for years up until today.

The VNC desktop is:      localhost:0
PORT=5900
14/06/2022 15:44:55 Got connection from client ::1
14/06/2022 15:44:55   other clients:
14/06/2022 15:44:55 Normal socket connection
14/06/2022 15:44:55 check_access: client addr ::1 is local.
14/06/2022 15:44:55 Disabled X server key autorepeat.
14/06/2022 15:44:55   to force back on run: 'xset r on' (3 times)
14/06/2022 15:44:55 incr accepted_client=1 for ::1:53984  sock=10
14/06/2022 15:44:55 Client Protocol Version 3.8
14/06/2022 15:44:55 Protocol version sent 3.8, using 3.8
14/06/2022 15:44:55 rfbProcessClientSecurityType: executing handler for type 1
14/06/2022 15:44:55 rfbProcessClientSecurityType: returning securityResult for client rfb version >= 3.8
14/06/2022 15:44:55 Pixel format for client ::1:
14/06/2022 15:44:55   32 bpp, depth 24, little endian
14/06/2022 15:44:55   true colour: max r 255 g 255 b 255, shift r 16 g 8 b 0
14/06/2022 15:44:55 no translation needed
14/06/2022 15:44:55 Using compression level 4 for client ::1
14/06/2022 15:44:55 Using image quality level 7 for client ::1
14/06/2022 15:44:55 Using JPEG subsampling 0, Q86 for client ::1
14/06/2022 15:44:55 Enabling X-style cursor updates for client ::1
14/06/2022 15:44:55 Enabling full-color cursor updates for client ::1
14/06/2022 15:44:55 Enabling cursor position updates for client ::1
14/06/2022 15:44:55 Enabling LastRect protocol extension for client ::1
14/06/2022 15:44:55 Enabling NewFBSize protocol extension for client ::1
14/06/2022 15:44:55 Using tight encoding for client ::1
14/06/2022 15:44:56 client 1 network rate 895.5 KB/sec (169956.4 eff KB/sec)
14/06/2022 15:44:56 client 1 latency:  42.0 ms
14/06/2022 15:44:56 dt1: 0.0732, dt2: 0.5335 dt3: 0.0420 bytes: 524450
14/06/2022 15:44:56 link_rate: LR_UNKNOWN - 42 ms, 895 KB/s
14/06/2022 15:44:56 client_set_net: ::1  0.0000
14/06/2022 15:44:56 created   xdamage object: 0x400040
14/06/2022 15:44:56 copy_tiles: allocating first_line at size 61
14/06/2022 15:44:59 set_ncache_xrootpmap: trying root background
14/06/2022 15:44:59 snapshotting background...
14/06/2022 15:44:59 done.
caught XIO error:
14/06/2022 15:44:59 deleted 60 tile_row polling images.

Happens 100% of the time: it displays the desktop, then 2 seconds later the caught XIO error occurs and I'm thrown out.
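For anyone collecting evidence, here is a small sketch for pulling the lines that lead up to each crash out of a saved log. It assumes the x11vnc output was saved to a file (x11vnc's -o logfile option does this; the path is up to you) and a grep that supports -B/-A (GNU and BSD grep do). The function names are invented for this example.

```shell
# Count how many times the crash message appears in a saved x11vnc log.
xio_count() {
    grep -c 'caught XIO error' "$1"
}

# Print each crash message with some lines of context before it
# (default 3) and one line after it.
xio_context() {
    grep -B "${2:-3}" -A 1 'caught XIO error' "$1"
}

# Usage:
#   x11vnc -localhost -display :0 -o /tmp/x11vnc.log
#   xio_count /tmp/x11vnc.log
#   xio_context /tmp/x11vnc.log 5
```

Posting the context lines rather than the whole log makes it easier to compare what different setups were doing just before the error.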

x11vnc version: 0.9.16 lastmod: 2019-01-05 (server running on KDE Neon User Edition 5.24 (Ubuntu 20.04 Focal)). For the viewer I'm using SSVNC 1.0.29 (client also running on KDE Neon User Edition 5.24 (Ubuntu 20.04 Focal)).

PartialVolume avatar Jun 14 '22 15:06 PartialVolume

Happens to me on Archlinux as well, x11vnc package version 1:0.9.16-5 (libvnc is version 0.9.13).

When a client connects to the server, it connects successfully and works for a few seconds but then the server logs a fatal exception and terminates:

18/07/2022 18:23:04 Client requested resolution change to (1278x772)
18/07/2022 18:23:04 Sending rfbEncodingExtDesktopSize for size (760x760) resize prohibited
18/07/2022 18:23:08 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0xFFFFFEC6)
18/07/2022 18:23:08 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0x574D5664)
18/07/2022 18:23:08 Enabling full-color cursor updates for client 192.168.2.105
18/07/2022 18:23:08 Enabling X-style cursor updates for client 192.168.2.105
18/07/2022 18:23:08 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0x574D5666)
18/07/2022 18:23:08 Enabling NewFBSize protocol extension for client 192.168.2.105
18/07/2022 18:23:08 Enabling ExtDesktopSize protocol extension for client 192.168.2.105
18/07/2022 18:23:08 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0xFFFFFECD)
18/07/2022 18:23:08 Enabling LastRect protocol extension for client 192.168.2.105
18/07/2022 18:23:08 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0xC0A1E5CE)
18/07/2022 18:23:08 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0xFFFFFEC7)
18/07/2022 18:23:08 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0xFFFFFEC8)
18/07/2022 18:23:08 rfbProcessClientNormalMessage: ignoring unsupported encoding type Enc(0xFFFFFEFE)
18/07/2022 18:23:08 Using compression level 2 for client 192.168.2.105
18/07/2022 18:23:08 Using image quality level 6 for client 192.168.2.105
18/07/2022 18:23:08 Using JPEG subsampling 0, Q79 for client 192.168.2.105
18/07/2022 18:23:08 Switching from tight to tight Encoding for client 192.168.2.105
caught XIO error:
18/07/2022 18:23:12 deleted 24 tile_row polling images.

-noxdamage does indeed fix this problem.

Funny, but it worked fine for a week and only manifested today. Maybe this issue only has a chance to occur when there is connection interference between client and server?

dbedrenko avatar Jul 19 '22 00:07 dbedrenko

Just started using x11vnc again today and am still getting the caught XIO error and then the connection closing. Version 0.9.16 on both client and server. Any progress on what this might be? Using a different VNC viewer, i.e. Remmina rather than the SSVNC viewer, makes no difference. Is there any recommended workaround other than -noxdamage, or has that been found to fix the problem?

PartialVolume avatar Oct 21 '22 10:10 PartialVolume

On Fedora 35 I have been running x11vnc -localhost -display :0 --repeat since June and I haven't seen any XIO error, on KDE Xorg sessions (not Wayland). The client vncviewer is on Ubuntu 18.04 LTS. One thing I found out is that we need to run DISPLAY=:0 xset dpms force on to wake up the screen after it has been turned off by "energy saving".

sergiomb2 avatar Oct 21 '22 11:10 sergiomb2

Initial results seem to show that specifying the option -noxdamage fixes the problem for me, however it would be interesting to discover the root cause of why xdamage bombs out with a caught XIO error.
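To put numbers on comparisons like this, here is a hedged sketch that reports how long a command survived and how it exited, so runs with and without -noxdamage can be compared side by side. time_until_exit is a made-up helper name, and the x11vnc command lines in the usage comments are the ones already quoted in this thread.

```shell
# Run a command, then report its exit status and how many seconds it ran.
time_until_exit() {
    start=$(date +%s)
    status=0
    "$@" || status=$?
    end=$(date +%s)
    echo "exit=$status seconds=$((end - start)) cmd=$*"
}

# Usage (connect a viewer and click around during each run):
#   time_until_exit x11vnc -localhost -display :0
#   time_until_exit x11vnc -localhost -display :0 -noxdamage
```

The summary line is printed last, after x11vnc's own output, so it is easy to copy into a report.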

PartialVolume avatar Oct 21 '22 13:10 PartialVolume

Hey guys, so I have a VM running CentOS 7. For a while x11vnc was running just fine. It's been a while since I had to use x11vnc so I am not sure what broke it and I need your help to fix it.

The version I have is x11vnc version: 0.9.13

The problem I have is that I run x11vnc, then connect to my VM from my laptop. I get prompted to enter the password for the user, and just when I should see the desktop the connection drops with the lines below:

caught XIO error:
12/06/2023 14:12:57 deleted 40 tile_row polling images.

I tried running with different options, e.g. x11vnc -noxdamage, but no luck. Any tips on what else I could try? Thank you for your patience and help.

imneofit avatar Jun 12 '23 13:06 imneofit

Hi @imneofit, Red Hat EL7 is near EOL (June 30, 2024, less than one year away). Two years ago I asked for x11vnc to be updated on EPEL 7 (https://src.fedoraproject.org/rpms/x11vnc/pull-request/2) but it wasn't approved. My suggestion is to build 0.9.16 for EPEL 7. Since this may be difficult for you, I did it: https://copr.fedorainfracloud.org/coprs/sergiomb/builds_for_Stable_Releases/package/x11vnc/

yum install yum-plugin-copr
yum copr enable sergiomb/builds_for_Stable_Releases
yum install x11vnc

sergiomb2 avatar Jun 13 '23 10:06 sergiomb2

Thanks a lot @sergiomb2. I literally fixed the problem a few moments ago. What I did was to remove GNOME, x11vnc, and a few other things. I put them back and it seems to be OK for now.

imneofit avatar Jun 13 '23 11:06 imneofit

Still happens, especially when you log out in the VNC session

     Loaded: loaded (/etc/systemd/system/x11vnc.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Wed 2023-11-15 14:54:38 CST; 16s ago
   Duration: 5min 58.269s
    Process: 637 ExecStart=/usr/bin/x11vnc -display :0 -auth guess -N -no6 -nevershared -forever -xkb -noxdamage -speeds 6,1500,40 -ping 5 -passwd NOYOUDONT (cod>
   Main PID: 637 (code=exited, status=3)
        CPU: 6.507s

11月 15 14:53:22 EbkTnkCentr x11vnc[637]: 15/11/2023 14:53:22 Sending rfbEncodingExtDesktopSize for size (1024x768)
11月 15 14:53:25 EbkTnkCentr x11vnc[637]: 15/11/2023 14:53:25 created selwin: 0x200024
11月 15 14:53:25 EbkTnkCentr x11vnc[637]: 15/11/2023 14:53:25 called initialize_xfixes()
11月 15 14:53:52 EbkTnkCentr x11vnc[637]: 15/11/2023 14:53:52 Sending rfbEncodingExtDesktopSize for size (1024x768)
11月 15 14:54:22 EbkTnkCentr x11vnc[637]: 15/11/2023 14:54:22 Sending rfbEncodingExtDesktopSize for size (1024x768)
11月 15 14:54:38 EbkTnkCentr x11vnc[637]: caught XIO error:
11月 15 14:54:38 EbkTnkCentr x11vnc[637]: 15/11/2023 14:54:38 deleted 32 tile_row polling images.
11月 15 14:54:38 EbkTnkCentr systemd[1]: x11vnc.service: Main process exited, code=exited, status=3/NOTIMPLEMENTED
11月 15 14:54:38 EbkTnkCentr systemd[1]: x11vnc.service: Failed with result 'exit-code'.
11月 15 14:54:38 EbkTnkCentr systemd[1]: x11vnc.service: Consumed 6.507s CPU time.

eebssk1 avatar Nov 15 '23 06:11 eebssk1

Process: 637 ExecStart=/usr/bin/x11vnc -display :0 -auth guess -N -no6 -nevershared -forever -xkb -noxdamage -speeds 6,1500,40 -ping 5 -passwd NOYOUDONT (cod>

x11vnc -localhost -display :0 -repeat -forever -nomodtweak

It is important to have libX11 updated to 1.8.7 and also libvncserver 0.9.13 with patches https://src.fedoraproject.org/rpms/libvncserver/tree/0c8fd29b3ec33ba126db38bee311226d21bf131a

This is working for me without problems.

sergiomb2 avatar Nov 15 '23 11:11 sergiomb2

I checked: Debian uses libX11 1.8.4, but up to 1.8.7 there are only security fixes. Debian is using libvncserver 0.9.14, but I'm not sure if it contains the patches you provided.

eebssk1 avatar Nov 15 '23 13:11 eebssk1

Yes, I just checked: all patches except https://github.com/LibVNC/libvncserver/pull/234 (TLS security type enablement patches) are included in 0.9.14.

I also meant to use -repeat and -nomodtweak to remove old hacks.

sergiomb2 avatar Nov 15 '23 14:11 sergiomb2