Segfault in lws_context_destroy / lws_destroy_event_pipe (LWS 4.1.6) when WSI count is 0 or at least one was successfully closed

Open cyanide-burnout opened this issue 1 year ago • 3 comments

2023-03-26 23:18:06 LWS: LWS: 4.1.6-, loglevel -1
2023-03-26 23:18:06 LWS: NET CLI SRV H1 H2 WS IPV6-on
2023-03-26 23:18:06 LWS: lws_create_context: ev lib path /usr/lib/x86_64-linux-gnu
2023-03-26 23:18:06 LWS:    /usr/lib/x86_64-linux-gnu/libwebsockets-evlib_glib.so
2023-03-26 23:18:06 LWS: _realloc: size 24: lws_plat_dlopen
2023-03-26 23:18:06 LWS: Event loop: glib
2023-03-26 23:18:06 LWS: _realloc: size 6600: context
2023-03-26 23:18:06 LWS: _realloc: size 72: lws_smd_register
2023-03-26 23:18:06 LWS: lws_smd_register: registered
2023-03-26 23:18:06 LWS: _realloc: size 8192: fds table
2023-03-26 23:18:06 LWS:  ctx:  5704B (1608 ctx + pt(1 thr x 4096)), pt-fds: 1024, fdmap: 8192
2023-03-26 23:18:06 LWS:  http: ah_data: 4096, ah: 976, max count 1024
2023-03-26 23:18:06 LWS: _realloc: size 8192: lws_lookup
2023-03-26 23:18:06 LWS:  mem: platform fd map:  8192 B
2023-03-26 23:18:06 LWS: _realloc: size 864: event pipe wsi
2023-03-26 23:18:06 LWS: lws_role_transition: 0x562831afcff0: wsistate 0x200, ops pipe
2023-03-26 23:18:06 LWS: event pipe fd 15
2023-03-26 23:18:06 LWS: __insert_wsi_socket_into_fds: 0x562831afcff0: tsi=0, sock=15, pos-in-fds=0
2023-03-26 23:18:06 LWS: elops_io_glib: wsi 0x562831afcff0, fd 15, 0x5/0x19
2023-03-26 23:18:06 LWS:  Compiled with OpenSSL support
2023-03-26 23:18:06 LWS: Doing SSL library init
2023-03-26 23:18:06 LWS:  canonical_hostname = worm
2023-03-26 23:18:06 LWS: _realloc: size 712: lws_create_vhost
2023-03-26 23:18:06 LWS: _realloc: size 112: vhost-specific plugin table
2023-03-26 23:18:06 LWS: _realloc: size 24: same vh list
2023-03-26 23:18:06 LWS: Creating Vhost 'default' (serving disabled), 1 protocols, IPv6 on
2023-03-26 23:18:06 LWS: _realloc: size 72: client ctx tcr
2023-03-26 23:18:06 LWS: lws_tls_client_create_vhost_context: vh default: created new client ctx 0
2023-03-26 23:18:06 LWS: created client ssl context for default
2023-03-26 23:18:06 LWS:  LWS_MAX_EXTENSIONS_ACTIVE: 1
2023-03-26 23:18:06 LWS:  mem: per-conn:          840 bytes + protocol rx buf
2023-03-26 23:18:06 LWS: lws_plat_drop_app_privileges: not changing group
2023-03-26 23:18:06 LWS: lws_plat_drop_app_privileges: not changing user
2023-03-26 23:18:06 LWS: lws_cancel_service
...
2023-03-26 23:18:06 LWS: lws_client_interpret_server_handshake: no content length
2023-03-26 23:18:06 LWS: lws_client_int_s_hs: no protocol list
2023-03-26 23:18:06 LWS: lws_client_ws_upgrade: WSI_TOKEN_PROTOCOL is null
2023-03-26 23:18:06 LWS: Selected protocol default
2023-03-26 23:18:06 LWS: no client extensions allowed by server
2023-03-26 23:18:06 LWS: lws_ensure_user_space: 0x562831afcff0 protocol pss 0, user_space=0x562831c51050
2023-03-26 23:18:06 LWS: __lws_header_table_detach: wsi 0x562831afcff0: ah 0x562831c511b0 (tsi=0, count = 1)
2023-03-26 23:18:06 LWS: __lws_header_table_detach: nobody usable waiting
2023-03-26 23:18:06 LWS: _lws_destroy_ah: freed ah 0x562831c511b0 : pool length 0
2023-03-26 23:18:06 LWS: __lws_header_table_detach: wsi 0x562831afcff0: ah 0x562831c511b0 (tsi=0, count = 0)
2023-03-26 23:18:06 LWS: lws_role_transition: 0x562831afcff0: wsistate 0x10000119, ops ws
2023-03-26 23:18:06 LWS: _lws_validity_confirmed_role: wsi 0x562831afcff0: setting validity timer 300s (hup 0)
2023-03-26 23:18:06 LWS: _realloc: size 4116: client frame buffer
2023-03-26 23:18:06 LWS: handshake OK for protocol default
...
2023-03-26 23:18:06 LWS: __lws_close_free_wsi: 0x562831afcff0: caller: close_and_handled
2023-03-26 23:18:06 LWS: __lws_close_free_wsi: real just_kill_connection: 0x562831afcff0 (sockfd 17)
2023-03-26 23:18:06 LWS: elops_io_glib: wsi 0x562831afcff0, fd 17, 0x8000000b/0x8
2023-03-26 23:18:06 LWS: elops_io_glib: wsi 0x562831afcff0, fd 17, 0xb/0x8
2023-03-26 23:18:06 LWS: lwsi_set_state(0x562831afcff0, 0x10000020)
2023-03-26 23:18:06 LWS: lws_vhost_unbind_wsi: vh default: count_bound_wsi 0
2023-03-26 23:18:06 LWS: __lws_free_wsi: 0x562831afcff0, remaining wsi 0, tsi fds count 0
...
2023-03-26 23:18:16 LWS: lws_context_destroy: ctx 0x562831b45960
2023-03-26 23:18:16 LWS: _lws_state_transition: system: changed 12 'OPERATIONAL' -> 13 'POLICY_INVALID'
2023-03-26 23:18:16 LWS: lws_destroy_event_pipe
2023-03-26 23:18:16 Got signal Segmentation fault (11) on thread test (TID 35318)
2023-03-26 23:18:16 Stack trace:
#1  <unknown> (/lib/x86_64-linux-gnu/libwebsockets.so.17)
#2  lws_destroy_event_pipe (/lib/x86_64-linux-gnu/libwebsockets.so.17)
	+ ./obj-x86_64-linux-gnu/lib/./lib/core-net/vhost.c:996.2
#3  lws_context_destroy (/lib/x86_64-linux-gnu/libwebsockets.so.17)
	+ ./obj-x86_64-linux-gnu/lib/./lib/core/context.c:1589.2

The crash does not happen when at least one WSI still exists at destruction time. Setup: GLib event loop, single thread. The GMainLoop and GMainContext are destroyed after the LWS context is destroyed.

I had no such problem with LWS 4.0.20, where I used a custom event loop via the LWS_CALLBACK_*_POLL_FD callbacks.

cyanide-burnout avatar Mar 26 '23 21:03 cyanide-burnout

Does this problem still exist on main branch lws?

lws-team avatar Mar 27 '23 04:03 lws-team

Tested. The main branch does not seem to have this problem. My guess is that it was fixed in 4.3.x. Unfortunately, the upcoming Debian 12 ships only 4.1.6, so I added a small workaround to my code:

void ReleaseLWSCore(struct LWSCore* core)
{
  const char* version;
  struct lws_client_connect_info information;

  version = lws_get_library_version();

  if ((version[3] == '.') &&
      (strncmp(version, "4.1.", 4) >= 0) &&
      (strncmp(version, "4.2.", 4) <= 0))
  {
    // Very stupid workaround for graceful stop on LWS 4.1.6
    // https://github.com/warmcat/libwebsockets/issues/2857

    memset(&information, 0, sizeof(struct lws_client_connect_info));

    information.context  = core->context;
    core->maximal       -= core->current;              // Amount of WSIs still open
    core->maximal        = MAXIMUM(core->maximal, 1);  // Maximal detected amount of WSIs

    while (core->maximal > 0)
    {
      lws_client_connect_via_info(&information);
      core->maximal --;
    }
  }

  lws_context_destroy(core->context);
  free(core);
}
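
The version gate in the snippet above compares string prefixes, which relies on the minor version being a single digit. As a minimal sketch of the same check done numerically, assuming the string returned by lws_get_library_version() starts with "major.minor" (as in the "4.1.6-" shown in the log), the helper below parses the two numbers with sscanf. The name lws_version_needs_workaround is hypothetical and is not part of libwebsockets.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not an LWS API): returns 1 when the version string
 * reports 4.1.x or 4.2.x, the same range the workaround above targets,
 * and 0 otherwise (including for NULL or unparsable input).
 */
static int lws_version_needs_workaround(const char *version)
{
    int major = 0, minor = 0;

    /* Expect the string to begin with "<major>.<minor>", e.g. "4.1.6-..." */
    if (!version || sscanf(version, "%d.%d", &major, &minor) != 2)
        return 0;

    return major == 4 && (minor == 1 || minor == 2);
}
```

In the workaround it could be called as lws_version_needs_workaround(lws_get_library_version()) in place of the three-part strncmp condition.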

I cannot move away from the GLib main loop, since it is probably the only one of the event loops listed by LWS that allows integration with external main loops.

cyanide-burnout avatar Mar 27 '23 15:03 cyanide-burnout

Finally found this: https://github.com/warmcat/libwebsockets/blob/35674b9f35208ecec63d936577dbabe87549ba1e/lib/core/context.c#L1758

Using lws_context_destroy2() helps :) But I did not find this in the docs :)

cyanide-burnout avatar Apr 01 '23 12:04 cyanide-burnout