NVidia: Core dump with `--gpu` option when connecting

Open Ra2-IFV opened this issue 1 year ago • 22 comments

Useful information:

Please try to gather as much useful information as possible and follow these instructions:

  • Version:
wayvnc: v0.9.1-e4ec935 (makepkg)
neatvnc: v0.9.1-cc19604 (makepkg)
aml: v0.3.0-0-gb83f357 (makepkg)
  • Try to reproduce while capturing a trace log:
Info: Capturing output DP
Info: >> 0000000000000 (DP) 1920x1080+0x0 Power:UNKNOWN
DEBUG: ../wayvnc/src/ctl-server.c: 809: Initializing wayvncctl socket: /run/user/1000/wayvncctl
DEBUG: ../wayvnc/src/ctl-server.c: 778: Connecting to existing socket in case it's stale
DEBUG: ../wayvnc/src/ctl-server.c: 785: Connect failed: Connection refused
Warning: ../wayvnc/src/ctl-server.c: 788: Deleting stale control socket path "/run/user/1000/wayvncctl"
DEBUG: ../neatvnc/src/server.c: 2114: Trying address: 0.0.0.0
DEBUG: ../neatvnc/src/server.c: 2129: Successfully bound to address
Info: Listening for connections on 0.0.0.0:5901
Info: New client connection from 127.0.0.1: 0x574e94a77620
DEBUG: ../neatvnc/src/server.c: 357: Client chose security type: 1
DEBUG: ../wayvnc/src/main.c: 1640: Configuring cursor capturing
DEBUG: ../wayvnc/src/main.c: 1656: Failed to capture cursor
Info: Starting screen capture
DEBUG: ../wayvnc/src/main.c: 1030: Acquired power state management. Waiting for power event to start capturing
DEBUG: ../wayvnc/src/main.c: 1383: Client connected, new client count: 1
DEBUG: ../wayvnc/src/ctl-server.c: 941: Enqueueing client-connected event: {"id":"1","address":"127.0.0.1","username":null,"seat":"seat0","connection_count":1}
DEBUG: ../wayvnc/src/ctl-server.c: 968: Enqueued client-connected event for 0 clients
Info: Client 0x574e94a77620 initialised. MIN-RTT during handshake was 0 ms
DEBUG: ../neatvnc/src/server.c: 673: Client 0x574e94a77620 set encodings: cursor,desktop-size,extended-desktop-size,qemu-led-state,vmware-led-state,extended-clipboard,continuous-updates,fence,qemu-extended-key-event,tight,copyrect,open-h264,zrle,hextile,rre,copyrect,raw
DEBUG: ../neatvnc/src/server.c: 2686: Keyboard LED state changed: ffffffff -> 0
Info: Choosing tight encoding for client 0x574e94a77620
DEBUG: ../neatvnc/src/server.c: 1630: Sending extended desktop resize rect: 1920x1080
DEBUG: ../wayvnc/src/buffer.c: 606: Reconfiguring buffer pool
DEBUG: ../wayvnc/src/buffer.c: 552: Using render node: /dev/dri/renderD128
DEBUG: ../neatvnc/src/server.c: 2686: Keyboard LED state changed: 0 -> 2
Segmentation fault (core dumped)
  • Get the stack trace:
Stack trace of thread 10000:
#0  0x00007bd52d680581 n/a (libc.so.6 + 0x16c581)
#1  0x00007bd52d786d5a nvnc_display_feed_buffer (libneatvnc.so.0 + 0x18d5a)
#2  0x000058c7d3306eab n/a (wayvnc + 0xbeab)
#3  0x000058c7d330bdba n/a (wayvnc + 0x10dba)
#4  0x00007bd52d50e596 n/a (libffi.so.8 + 0x7596)
#5  0x00007bd52d50b00e n/a (libffi.so.8 + 0x400e)
#6  0x00007bd52d50dbd3 ffi_call (libffi.so.8 + 0x6bd3)
#7  0x00007bd52d7a58b0 n/a (libwayland-client.so.0 + 0x48b0)
#8  0x00007bd52d7a6139 n/a (libwayland-client.so.0 + 0x5139)
#9  0x00007bd52d7a6553 wl_display_dispatch_queue_pending (libwayland-client.so.0 + 0x5553)
#10 0x000058c7d33094fd n/a (wayvnc + 0xe4fd)
#11 0x00007bd52d7da61f aml_dispatch (libaml.so.0 + 0x361f)
#12 0x000058c7d33015b1 n/a (wayvnc + 0x65b1)
#13 0x00007bd52d539e08 n/a (libc.so.6 + 0x25e08)
#14 0x00007bd52d539ecc __libc_start_main (libc.so.6 + 0x25ecc)
#15 0x000058c7d3302405 n/a (wayvnc + 0x7405)
  • Describe how to reproduce the problem

Command

  • wayvnc --gpu

Config

address=0.0.0.0
port=5901
use_relative_paths=true

Ra2-IFV avatar Dec 02 '24 14:12 Ra2-IFV

Sadly, that stack trace is useless to me without line numbers. Please follow the instructions to get a useful stack trace.

any1 avatar Dec 02 '24 14:12 any1

GDB says no debug symbols were found. How do I enable them? I didn't see an option for this when compiling. Also, could you tell me how to get the core dump after wayvnc has terminated?

Ra2-IFV avatar Dec 03 '24 12:12 Ra2-IFV

any help is appreciated...

Ra2-IFV avatar Feb 07 '25 07:02 Ra2-IFV

Follow the build instructions in README.md. Instead of meson build, say meson setup build --buildtype=debug.

any1 avatar Feb 07 '25 12:02 any1

wayvnc: v0.9.1-e4ec935 (makepkg)
neatvnc: v0.9.2-4c37ae9 (makepkg)
aml: v0.3.0-0-gb83f357 (makepkg)

Here is a stack trace with line numbers:

#0  __memcpy_evex_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:265
#1  0x00007ffff7da4574 in memcpy (__dest=<optimized out>, __src=<optimized out>, __len=<optimized out>, __dest=<optimized out>, __src=<optimized out>, __len=<optimized out>) at /usr/include/bits/string_fortified.h:29
#2  XXH_memcpy (dest=<optimized out>, src=0x80, size=<optimized out>) at ../neatvnc/include/xxhash.h:2361
#3  XXH3_update (state=<optimized out>, input=0x80 <error: Cannot access memory at address 0x80>, len=<optimized out>, f_acc=<optimized out>, f_scramble=<optimized out>) at ../neatvnc/include/xxhash.h:6282
#4  XXH3_64bits_update (state=<optimized out>, input=0x80, len=<optimized out>) at ../neatvnc/include/xxhash.h:6336
#5  damage_hash_tile (self=<optimized out>, tx=<optimized out>, ty=<optimized out>, buffer=<optimized out>) at ../neatvnc/src/damage-refinery.c:91
#6  damage_refine_tile (self=<optimized out>, refined=0x7fffffff3980, tx=<optimized out>, ty=<optimized out>, buffer=<optimized out>) at ../neatvnc/src/damage-refinery.c:109
#7  damage_refine (self=<optimized out>, refined=0x7fffffff3980, hint=0x80, buffer=<optimized out>) at ../neatvnc/src/damage-refinery.c:156
#8  nvnc_display_feed_buffer (self=<optimized out>, fb=<optimized out>, damage=damage@entry=0x7fffffff3a90) at ../neatvnc/src/display.c:128
#9  0x000055555555e468 in wayvnc_process_frame (self=0x7fffffff4230, buffer=0x5555555d55b0) at ../wayvnc/src/main.c:1148
#10 on_capture_done (result=<optimized out>, buffer=0x5555555d55b0, userdata=0x7fffffff4230) at ../wayvnc/src/main.c:1170
#11 0x0000555555560ff6 in screencopy_ready (data=0x5555555d1e10, frame=<optimized out>, sec_hi=<optimized out>, sec_lo=<optimized out>, nsec=<optimized out>) at ../wayvnc/src/screencopy.c:228
#12 0x00007ffff7b06596 in ffi_call_unix64 () at ../src/x86/unix64.S:104
#13 0x00007ffff7b0300e in ffi_call_int (cif=cif@entry=0x7fffffff3d10, fn=<optimized out>, rvalue=<optimized out>, avalue=<optimized out>, closure=closure@entry=0x0) at ../src/x86/ffi64.c:673
#14 0x00007ffff7b05bd3 in ffi_call (cif=cif@entry=0x7fffffff3d10, fn=<optimized out>, rvalue=rvalue@entry=0x0, avalue=avalue@entry=0x7fffffff3de0) at ../src/x86/ffi64.c:710
#15 0x00007ffff7dbf8b0 in wl_closure_invoke (closure=closure@entry=0x5555555c7e30, target=<optimized out>, target@entry=0x5555555d1ef0, opcode=opcode@entry=0x2, data=<optimized out>, flags=0x1) at ../wayland-1.23.1/src/connection.c:1228
#16 0x00007ffff7dc0139 in dispatch_event (display=display@entry=0x5555555c1fd0, queue=queue@entry=0x5555555c20c8) at ../wayland-1.23.1/src/wayland-client.c:1674
#17 0x00007ffff7dc0553 in dispatch_queue (display=0x5555555c1fd0, queue=0x5555555c20c8) at ../wayland-1.23.1/src/wayland-client.c:1820
#18 wl_display_dispatch_queue_pending (display=0x5555555c1fd0, queue=0x5555555c20c8) at ../wayland-1.23.1/src/wayland-client.c:2062
#19 0x00007ffff7dc05c1 in wl_display_dispatch_pending (display=<optimized out>) at ../wayland-1.23.1/src/wayland-client.c:2125
#20 0x000055555556436d in on_wayland_event (obj=<optimized out>) at ../wayvnc/src/main.c:520
#21 0x00007ffff7dea61f in aml__handle_event (self=<optimized out>, obj=0x5555555c7230) at ../aml/src/aml.c:801
#22 aml_dispatch (self=self@entry=0x5555555c1d30) at ../aml/src/aml.c:853
#23 0x000055555555a470 in main (argc=<optimized out>, argv=<optimized out>) at ../wayvnc/src/main.c:2146

net147 avatar Feb 08 '25 20:02 net147

I'm seeing almost exactly the same thing. Running wayvnc with the --gpu flag results in a crash with this log:

Info: Capturing output HDMI-A-1
Info: >> Samsung Electric Company LF27T35 H4ZR900240 (HDMI-A-1) 1920x1080+0x0 Power:UNKNOWN
DEBUG: ../wayvnc/src/ctl-server.c: 809: Initializing wayvncctl socket: /run/user/1000/wayvncctl
DEBUG: ../neatvnc/src/server.c: 2128: Trying address: 192.168.1.77
DEBUG: ../neatvnc/src/server.c: 2143: Successfully bound to address
Info: Listening for connections on 192.168.1.77:5900
Info: New client connection from 192.168.1.4: 0x57dca8b99050
DEBUG: ../neatvnc/src/server.c: 357: Client chose security type: 1
DEBUG: ../wayvnc/src/main.c: 1640: Configuring cursor capturing
DEBUG: ../wayvnc/src/main.c: 1656: Failed to capture cursor
Info: Starting screen capture
DEBUG: ../wayvnc/src/main.c: 1030: Acquired power state management. Waiting for power event to start capturing
DEBUG: ../wayvnc/src/main.c: 1383: Client connected, new client count: 1
DEBUG: ../wayvnc/src/ctl-server.c: 941: Enqueueing client-connected event: {"id":"1","address":"192.168.1.4","username":null,"seat":"Hyprland","connection_count":1}
DEBUG: ../wayvnc/src/ctl-server.c: 968: Enqueued client-connected event for 0 clients
Info: Client 0x57dca8b99050 initialised. MIN-RTT during handshake was 1 ms
DEBUG: ../wayvnc/src/buffer.c: 606: Reconfiguring buffer pool
DEBUG: ../wayvnc/src/buffer.c: 552: Using render node: /dev/dri/renderD128
DEBUG: ../neatvnc/src/server.c: 676: Client 0x57dca8b99050 set encodings: zrle,trle,hextile,rre,raw,copyrect,cursor,desktop-size
DEBUG: ../neatvnc/src/server.c: 539: Using color palette for client 0x57dca8b99050
DEBUG: ../neatvnc/src/server.c: 553: Client 0x57dca8b99050 chose pixel format: RGB222
Info: Choosing zrle encoding for client 0x57dca8b99050
DEBUG: ../neatvnc/src/server.c: 676: Client 0x57dca8b99050 set encodings: raw,zrle,trle,hextile,rre,copyrect,cursor,desktop-size
Info: Choosing raw encoding for client 0x57dca8b99050
DEBUG: ../neatvnc/src/server.c: 676: Client 0x57dca8b99050 set encodings: zrle,trle,hextile,rre,raw,copyrect,cursor,desktop-size
DEBUG: ../neatvnc/src/server.c: 539: Using color palette for client 0x57dca8b99050
DEBUG: ../neatvnc/src/server.c: 553: Client 0x57dca8b99050 chose pixel format: XRGB8888
Info: Choosing zrle encoding for client 0x57dca8b99050
[1]    1855 segmentation fault (core dumped)  wayvnc 192.168.1.77 5900 -Ltrace --gpu |
       1856 done                              tee wayvnc-crash.log

WayVnc Version:

wayvnc: v0.9.1-e4ec935 (makepkg)
neatvnc: v0.9.2-4c37ae9 (makepkg)
aml: v0.3.0-0-gb83f357 (makepkg)

wayzaction avatar Feb 24 '25 23:02 wayzaction

Do we have enough information for now?

Ra2-IFV avatar Mar 14 '25 17:03 Ra2-IFV

Neat VNC v0.9.4 has some bug fixes, so you're probably better off running that. However, I suspect that these crashes may be due to out-of-bounds access to DMA-BUFs from the GPU.

any1 avatar Mar 14 '25 18:03 any1

Still crashing in my case.

Ra2-IFV avatar Mar 30 '25 04:03 Ra2-IFV

DEBUG: ../subprojects/neatvnc/src/server.c: 361: Client chose security type: 1
DEBUG: ../src/main.c: 1764: Configuring cursor capturing
DEBUG: ../src/main.c: 1780: Failed to capture cursor
Info: Starting screen capture
DEBUG: ../src/main.c: 1154: Acquired power state management. Waiting for power event to start capturing
DEBUG: ../src/main.c: 1507: Client connected, new client count: 1
DEBUG: ../src/ctl-server.c: 941: Enqueueing client-connected event: {"id":"1","address":"CLIENT_IP","username":null,"seat":"seat0","connection_count":1}
DEBUG: ../src/ctl-server.c: 968: Enqueued client-connected event for 0 clients
Info: Client 0x5555555d3d00 initialised. MIN-RTT during handshake was 3 ms
TRACE: ../src/output.c: 280: Output DP-1 power state changed to ON
TRACE: ../src/main.c: 1207: Output DP-1 power state changed to ON
DEBUG: ../src/buffer.c: 607: Reconfiguring buffer pool
DEBUG: ../src/buffer.c: 553: Using render node: /dev/dri/renderD128
TRACE: ../src/buffer.c: 394: wv_buffer_create: 1920x1080, stride: 0, format: 875713112
DEBUG: ../subprojects/neatvnc/src/server.c: 545: Using color palette for client 0x5555555d3d00
DEBUG: ../subprojects/neatvnc/src/server.c: 559: Client 0x5555555d3d00 chose pixel format: XRGB8888
DEBUG: ../subprojects/neatvnc/src/server.c: 683: Client 0x5555555d3d00 set encodings: raw,cursor,desktop-size,extended-desktop-size,qemu-extended-key-event,extended-clipboard
Info: Choosing raw encoding for client 0x5555555d3d00
TRACE: ../src/main.c: 1254: Passing on buffer: 0x5555555f8b90

Thread 1 "wayvnc" received signal SIGSEGV, Segmentation fault.
0x00007ffff7c8f241 in ?? () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff7c8f241 in ?? () from /usr/lib/libc.so.6
#1  0x00007ffff7f9d04a in XXH_memcpy (dest=<optimized out>, src=0x80, size=128)
    at ../subprojects/neatvnc/include/xxhash.h:2361
#2  XXH3_update (state=<optimized out>, input=0x80 <error: Cannot access memory at address 0x80>,
    len=128, f_acc=<optimized out>, f_scramble=<optimized out>)
    at ../subprojects/neatvnc/include/xxhash.h:6282
#3  XXH3_64bits_update (state=<optimized out>, input=0x80, len=128)
    at ../subprojects/neatvnc/include/xxhash.h:6336
#4  damage_hash_tile (self=<optimized out>, tx=1, ty=<optimized out>, buffer=0x5555555f87c0)
    at ../subprojects/neatvnc/src/damage-refinery.c:91
#5  damage_refine_tile (self=<optimized out>, refined=0x7fffffff2b40, tx=1, ty=<optimized out>,
    buffer=0x5555555f87c0) at ../subprojects/neatvnc/src/damage-refinery.c:109
#6  damage_refine (self=<optimized out>, refined=0x7fffffff2b40, hint=0x85ebca77, buffer=<optimized out>)
    at ../subprojects/neatvnc/src/damage-refinery.c:156
#7  nvnc_display_feed_buffer (self=<optimized out>, fb=<optimized out>,
    damage=damage@entry=0x7fffffff2cd0) at ../subprojects/neatvnc/src/display.c:128
#8  0x000055555556a67b in wayvnc_process_frame (self=0x7fffffff3450, buffer=0x5555555f8b90)
    at ../src/main.c:1272
#9  on_capture_done (result=<optimized out>, buffer=0x5555555f8b90, userdata=0x7fffffff3450)
    at ../src/main.c:1294
#10 0x00005555555706cf in screencopy_ready (data=0x5555555dfe00, frame=<optimized out>,
    sec_hi=<optimized out>, sec_lo=<optimized out>, nsec=<optimized out>) at ../src/screencopy.c:228
#11 0x00007ffff7a12976 in ?? () from /usr/lib/libffi.so.8
#12 0x00007ffff7a0f12c in ?? () from /usr/lib/libffi.so.8
#13 0x00007ffff7a11f0e in ffi_call () from /usr/lib/libffi.so.8
#14 0x00007ffff7d918b0 in ?? () from /usr/lib/libwayland-client.so.0
#15 0x00007ffff7d92139 in ?? () from /usr/lib/libwayland-client.so.0
#16 0x00007ffff7d92553 in wl_display_dispatch_queue_pending () from /usr/lib/libwayland-client.so.0
#17 0x000055555556ce7f in on_wayland_event (handler=<optimized out>) at ../src/main.c:514
#18 0x00007ffff7fbaaee in aml__handle_event (self=<optimized out>, obj=0x5555555de540)
    at ../subprojects/aml/src/aml.c:928
#19 aml_dispatch (self=self@entry=0x5555555cbec0) at ../subprojects/aml/src/aml.c:980
#20 0x0000555555573df9 in main (argc=<optimized out>, argv=<optimized out>) at ../src/main.c:2287

Ra2-IFV avatar Mar 30 '25 04:03 Ra2-IFV

Do you have multiple GPUs on the system?

any1 avatar Mar 30 '25 09:03 any1

No.

Ra2-IFV avatar Mar 30 '25 16:03 Ra2-IFV

I'm having this issue as well, also with a single GPU.

tcvdh avatar May 05 '25 13:05 tcvdh

I ran it in gdb. It looks like the problem is that gbm_bo_map is returning NULL:

TRACE: ../src/output.c: 280: Output HDMI-A-2 power state changed to ON
TRACE: ../src/main.c: 1248: Output HDMI-A-2 power state changed to ON
DEBUG: ../src/buffer.c: 607: Reconfiguring buffer pool
DEBUG: ../src/buffer.c: 553: Using render node: /dev/dri/renderD128
TRACE: ../src/buffer.c: 394: wv_buffer_create: 2560x1440, stride: 0, format: 875713112
DEBUG: ../subprojects/neatvnc/src/server.c: 2835: Keyboard LED state changed: ffffffff -> 0
Info: Choosing tight encoding for client 0x5555555999f0
TRACE: ../src/main.c: 1333: Processing buffer: 0x5555555f03b0

Thread 1 "wayvnc" hit Breakpoint 1, 0x00007ffff7ec1bc4 in gbm_bo_map ()
   from /nix/store/9g1k7zk2h6013f1a7mdvi72f261pihym-mesa-libgbm-25.1.0/lib/libgbm.so.1
(gdb) finish
Run till exit from #0  0x00007ffff7ec1bc4 in gbm_bo_map ()
   from /nix/store/9g1k7zk2h6013f1a7mdvi72f261pihym-mesa-libgbm-25.1.0/lib/libgbm.so.1
0x00007ffff7f943ca in nvnc_fb_map (fb=0x5555555ba950)
    at ../subprojects/neatvnc/src/fb.c:258
258		fb->addr = gbm_bo_map(fb->bo, 0, 0, fb->width, fb->height,
(gdb) next
260		fb->stride = stride / nvnc_fb_get_pixel_size(fb);
(gdb) print fb->addr
$1 = (void *) 0x0
(gdb)

RazeLighter777 avatar Oct 15 '25 20:10 RazeLighter777
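
A NULL mapping would also account for the odd `0x80` address in the earlier backtraces (`input=0x80`, `tx=1`, `len=128`). Below is a minimal sketch of that arithmetic. It assumes, without confirmation from the code quoted in this thread, that the damage refinery hashes tiles 32 pixels wide, at 4 bytes per pixel, one row at a time. `TILE_SIZE`, `BYTES_PER_PIXEL`, `tile_row_offset` and `tile_row_address` are illustrative names, not neatvnc identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values: 32-pixel-wide tiles and 4 bytes per pixel (e.g. XRGB8888). */
#define TILE_SIZE 32u
#define BYTES_PER_PIXEL 4u

/* One tile row is 128 bytes, which matches the len=128 in the backtrace. */
static_assert(TILE_SIZE * BYTES_PER_PIXEL == 128, "one tile row is 128 bytes");

/* Byte offset of framebuffer row `y` within tile column `tx`, measured from
 * the start of the mapping. `stride_bytes` is the length of one full
 * framebuffer row in bytes. */
static size_t tile_row_offset(uint32_t tx, uint32_t y, size_t stride_bytes)
{
	return (size_t)tx * TILE_SIZE * BYTES_PER_PIXEL + (size_t)y * stride_bytes;
}

/* Address the hash function would read from. Computed on integers so that
 * no pointer arithmetic is ever performed on a NULL base. */
static uintptr_t tile_row_address(uintptr_t base, uint32_t tx, uint32_t y,
		size_t stride_bytes)
{
	return base + tile_row_offset(tx, y, stride_bytes);
}
```

With `base` set to 0 (the NULL returned by gbm_bo_map), `tx` = 1 and `y` = 0, `tile_row_address` yields 0x80, the address that XXH3_update faulted on.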

Ahh, I was lazy and didn't check for an error from nvnc_fb_map in the damage refinery. Maybe Neat VNC should just panic when gbm_bo_map fails.

any1 avatar Oct 15 '25 22:10 any1
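
The missing check described above could look roughly like the sketch below. It is not the actual Neat VNC code: `struct demo_fb`, `demo_fb_map` and `demo_fb_hash` are hypothetical stand-ins, the map step is mocked instead of calling gbm_bo_map, and a simple FNV-1a hash replaces XXH3. The only point is that the hashing code refuses to touch the buffer when mapping fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a framebuffer; not neatvnc's struct nvnc_fb. */
struct demo_fb {
	void *addr;    /* CPU mapping, NULL until mapped */
	size_t size;   /* size of the mapping in bytes */
	void *backing; /* what the mocked driver hands back; may be NULL */
};

/* Mocked map step: returns 0 on success, -1 if no mapping was obtained. */
static int demo_fb_map(struct demo_fb *fb)
{
	assert(fb != NULL);
	if (fb->addr)
		return 0;
	fb->addr = fb->backing;
	return fb->addr ? 0 : -1;
}

/* Hash the buffer contents with FNV-1a (standing in for XXH3).
 * Returns 0 and stores the hash on success. Returns -1 and leaves
 * *hash_out untouched when the buffer could not be mapped. */
static int demo_fb_hash(struct demo_fb *fb, uint64_t *hash_out)
{
	if (demo_fb_map(fb) < 0)
		return -1; /* report the failure instead of reading from NULL */

	const uint8_t *p = fb->addr;
	uint64_t h = 0xcbf29ce484222325ULL; /* FNV-1a 64-bit offset basis */
	for (size_t i = 0; i < fb->size; ++i) {
		h ^= p[i];
		h *= 0x100000001b3ULL; /* FNV-1a 64-bit prime */
	}
	*hash_out = h;
	return 0;
}
```

A caller that gets -1 back can log the reason and skip the frame or abort cleanly, which is the kind of improved error reporting discussed in this thread.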

Anyway, this is an NVidia bug and I can't do anything about it other than improve error reporting so people will at least know why the thing crashed.

any1 avatar Oct 15 '25 22:10 any1

Is there any way I can make a fix? NVIDIA uses this API in their docs: https://developer.nvidia.com/docs/drive/drive-os/6.0.9/public/drive-os-linux-sdk/common/topics/window_system_stub/BufferAllocation120.html

RazeLighter777 avatar Oct 15 '25 22:10 RazeLighter777

Well, apparently, they claim to support gbm_bo_map, but it's not working. If you replace GBM_BO_TRANSFER_READ with GBM_BO_TRANSFER_READ_WRITE, does it make a difference?

In any case, you don't have encoding via VA-API on nvidia, so there's not really much to be gained from using GPU buffers here. The gbm_bo_map code path is a fallback for when h264 is not used. It's pretty fast on architectures where GPU and CPU memory is shared, but you might be better off with glReadPixels otherwise (which is what happens when you don't set the --gpu flag).

any1 avatar Oct 16 '25 09:10 any1

Would adding nvenc support be an option?

RazeLighter777 avatar Oct 16 '25 12:10 RazeLighter777

> Would adding nvenc support be an option?

I will not be adding nvenc, but others are free to do so.

any1 avatar Oct 16 '25 12:10 any1

> I will not be adding nvenc, but others are free to do so.

Because it isn't free software? Or because it's hard to support?

Ra2-IFV avatar Oct 16 '25 18:10 Ra2-IFV

> > I will not be adding nvenc, but others are free to do so.
>
> Because it isn't free software? Or because it's hard to support?

  • I have enough things to work on as is
  • Most of those things are more interesting than this
  • I don't have any nv hardware, so I wouldn't be able to test and iterate
  • I do not want to have any nv hardware as it would just create more clutter in my life

any1 avatar Oct 16 '25 18:10 any1