irssi icon indicating copy to clipboard operation
irssi copied to clipboard

Segfault on sending large message

Open spaghetti2514 opened this issue 7 years ago • 47 comments

When I try to print the entire christian bible to a channel (with flood protection disabled on the server and client), irssi segfaults part of the way through.

If I use /exec -out cat bible.txt to print the bible, irssi segfaults. If I use a different command that breaks the file into ~10 chunks and sleeps for two seconds between printing chunks, the entire thing is successfully sent.

Steps to reproduce:

  • run the above command with a file of approximately the same size as the entire bible.

Expected behavior:

  • Entire bible is sent to channel
  • irssi does not segfault

Actual behavior:

  • Part of the bible is sent to channel
  • irssi segfaults

I am using irssi 1.0.5-1 (20171020 1715) from the debian testing repositories, but this issue has persisted across several versions.

Please let me know if there is any additional information that would be useful for me to provide.

spaghetti2514 avatar Nov 30 '17 17:11 spaghetti2514

Works for me using ftp://ftp.cs.princeton.edu/pub/cs226/textfiles/bible.txt (3.9mb)

Get a backtrace https://wiki.debian.org/HowToGetABacktrace

dequis avatar Nov 30 '17 17:11 dequis

@dequis

#0  flood_newmsg (server=0x555555b96a10, level=4, nick=0x5555559b6c41 "iled", host=0x5555559b6c4a "", target=0x555556f25600 "#test") at flood.c:196
#1  0x00005555555e053f in flood_privmsg (server=0x555555b96a10, data=<optimized out>, nick=0x5555559b6c41 "iled", addr=0x5555559b6c4a "") at flood.c:255
#2  0x00005555555fa0b6 in signal_emit_real (rec=rec@entry=0x555555870350, params=params@entry=4, va=va@entry=0x7fffffffe1d0, first_hook=<optimized out>) at signals.c:242
#3  0x00005555555fa60d in signal_emit (signal=<optimized out>, params=params@entry=4) at signals.c:286
#4  0x00005555555c0858 in irc_server_event (server=0x555555b96a10, line=0x5555559b6c6c "", nick=0x5555559b6c41 "iled", address=0x5555559b6c4a "") at irc.c:308
#5  0x00005555555fa0b6 in signal_emit_real (rec=rec@entry=0x5555558743d0, params=params@entry=4, va=va@entry=0x7fffffffe390, first_hook=<optimized out>) at signals.c:242
#6  0x00005555555fa6e5 in signal_emit_id (signal_id=<optimized out>, params=4) at signals.c:304
#7  0x00005555555fa0b6 in signal_emit_real (rec=rec@entry=0x55555586ed00, params=params@entry=2, va=va@entry=0x7fffffffe500, first_hook=<optimized out>) at signals.c:242
#8  0x00005555555fa6e5 in signal_emit_id (signal_id=<optimized out>, params=params@entry=2) at signals.c:304
#9  0x00005555555c09b5 in irc_parse_incoming (server=0x555555b96a10) at irc.c:383
#10 0x00005555555eb84d in irssi_io_invoke (source=<optimized out>, condition=<optimized out>, data=<optimized out>) at misc.c:55
#11 0x00007ffff6ba8dd5 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#12 0x00007ffff6ba91a0 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#13 0x00007ffff6ba922c in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#14 0x000055555557701f in main (argc=<optimized out>, argv=<optimized out>) at irssi.c:335

Got to 1 Kings 11:43 before the crash

spaghetti2514 avatar Nov 30 '17 17:11 spaghetti2514

What quit message do other people see in the channel?

Does this also happen with a clean profile? irssi --home=~/.irssi-test/

dequis avatar Nov 30 '17 19:11 dequis

What quit message do other people see in the channel?

A few seconds after the messages stop, irssi segfaults and -!- test [[email protected]] has quit [Connection closed] is sent to the channel

Does this also happen with a clean profile?

With a completely clean profile, flood protection stops me from trying this. After turning off flood protection, I send the entire thing, but only so much of it actually shows up to others in the channel (same as was happening before) but a few seconds later instead of segfaulting, the connection is just closed. Irssi says "Connection lost" and the channel displays the same quit message that it was when segfaulting occurred (has quit [Connection closed]).

I'm surprised the behavior changed, since the only real changes to my config besides turning off flood protection are colorschemes and aliases.

spaghetti2514 avatar Nov 30 '17 21:11 spaghetti2514

Okay this is interesting. For reference, I'm testing this with a "test" client where I actually send the bible text from, and another client (the "viewing" client) joined to the same channel where I observe externally.

When I send the bible text in the test client, I see it all get sent to the channel within a second or two. From the viewing client, I see the text streaming in rapidly for several seconds, before abruptly halting at some point well before the end of the text.

At this point, I usually wait for a few seconds and then the client segfaults or gets disconnected (depending on config), but I just realized that this always coincides with someone else in the channel saying or doing something that gets sent to the test client. In the case of the original config where segfaulting occurs, anyone saying anything in the channel or quitting the channel after the text has halted triggers the segfault. With the fresh config, it triggers a disconnect.

spaghetti2514 avatar Nov 30 '17 21:11 spaghetti2514

Here's another backtrace

#0  flood_newmsg (server=0x555555b96c40, level=4, nick=0x5555559b6c41 "", host=0x5555559b6c4b "table_lookup: assertion 'hash_table != NULL' failed", 
    target=0x555555d6bf80 "#test") at flood.c:196
#1  0x00005555555e053f in flood_privmsg (server=0x555555b96c40, data=<optimized out>, nick=0x5555559b6c41 "", 
    addr=0x5555559b6c4b "table_lookup: assertion 'hash_table != NULL' failed") at flood.c:255
#2  0x00005555555fa0b6 in signal_emit_real (rec=rec@entry=0x555555870350, params=params@entry=4, va=va@entry=0x7fffffffe1d0, first_hook=<optimized out>) at signals.c:242
#3  0x00005555555fa60d in signal_emit (signal=<optimized out>, params=params@entry=4) at signals.c:286
#4  0x00005555555c0858 in irc_server_event (server=0x555555b96c40, line=0x5555559b6c6b "ble != NULL' failed", nick=0x5555559b6c41 "", 
    address=0x5555559b6c4b "table_lookup: assertion 'hash_table != NULL' failed") at irc.c:308
#5  0x00005555555fa0b6 in signal_emit_real (rec=rec@entry=0x5555558743d0, params=params@entry=4, va=va@entry=0x7fffffffe390, first_hook=<optimized out>) at signals.c:242
#6  0x00005555555fa6e5 in signal_emit_id (signal_id=<optimized out>, params=4) at signals.c:304
#7  0x00005555555fa0b6 in signal_emit_real (rec=rec@entry=0x55555586ed00, params=params@entry=2, va=va@entry=0x7fffffffe500, first_hook=<optimized out>) at signals.c:242
#8  0x00005555555fa6e5 in signal_emit_id (signal_id=<optimized out>, params=params@entry=2) at signals.c:304
#9  0x00005555555c09b5 in irc_parse_incoming (server=0x555555b96c40) at irc.c:383
#10 0x00005555555eb84d in irssi_io_invoke (source=<optimized out>, condition=<optimized out>, data=<optimized out>) at misc.c:55
#11 0x00007ffff6ba8dd5 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#12 0x00007ffff6ba91a0 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#13 0x00007ffff6ba922c in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#14 0x000055555557701f in main (argc=<optimized out>, argv=<optimized out>) at irssi.c:335

spaghetti2514 avatar Nov 30 '17 22:11 spaghetti2514

are there any scripts loaded which could explain the difference between the configs, or any other config differences?

which client crashes, the one who sends the bible?

ailin-nemui avatar Dec 05 '17 15:12 ailin-nemui

are there any scripts loaded which could explain the difference between the configs, or any other config differences?

The only scripts I have loaded are a few simple things that register a command. Nothing that listens to all lines, incoming or outgoing, like I would expect something to do if it causes a segfault from high volume text. The only config differences are also very minor - just a few aliases an a colorscheme. When I have time later I'll try enabling things in my normal config bit by bit in a fresh config and see at what point it starts crashing.

which client crashes, the one who sends the bible?

Yes

spaghetti2514 avatar Dec 05 '17 15:12 spaghetti2514

the odd thing in your backtrace is that it crashes in some incoming message, so it should be unrelated to your sending of the bible. (but maybe that sending causes some memory corruption somewhere? just random speculation from me..)

are you trying it with the bible dequis linked?

do you have the possibility to get a wireshark capture of the network traffic before it crashes?

#4  0x00005555555c0858 in irc_server_event (server=0x555555b96c40, line=0x5555559b6c6b "ble != NULL' failed", nick=0x5555559b6c41 "", 
    address=0x5555559b6c4b "table_lookup: assertion 'hash_table != NULL' failed") at irc.c:308

already does not look like a real line that should be coming in from the IRC server

if you go to the first frame (f 0) and p *server, is the server record intact? you can also try to show us a bt full

ailin-nemui avatar Dec 05 '17 15:12 ailin-nemui

the odd thing in your backtrace is that it crashes in some incoming message

Yeah, it's always the first incoming message after the bible text stops getting sent.

are you trying it with the bible dequis linked?

I don't think they're exactly the same bible, but they are nearly the same size and it's all just normal ascii text.

if you go to the first frame (f 0) and p *server, is the server record intact?

(gdb) p *server
$1 = {type = 716, chat_type = 66, refcount = 1, connrec = 0x555555ab3680, connect_time = 1512491572, real_connect_time = 1512491574, tag = 0x555555c86b30 "test", 
  nick = 0x555555ccdde0 "test", connected = 1, disconnected = 1, connection_lost = 1, session_reconnect = 0, no_reconnect = 0, handle = 0x0, readtag = -1, connect_pipe = {0x0, 0x0}, 
  connect_tag = -1, connect_pid = 2281, rawlog = 0x555555c858e0, module_data = 0x555555a6a060, version = 0x555555ab4570 "InspIRCd-2.2", away_reason = 0x0, last_invite = 0x0, 
  server_operator = 1, usermode_away = 0, banned = 0, dns_error = 0, lag_sent = {tv_sec = 0, tv_usec = 0}, lag_last_check = 1512491575, lag = 132, channels = 0x0, queries = 0x0, 
  channels_join = 0x5555555c1980 <irc_channels_join>, isnickflag = 0x5555555c8670 <isnickflag_func>, ischannel = 0x5555555c9410 <ischannel_func>, 
  get_nick_flags = 0x5555555c6db0 <get_nick_flags>, send_message = 0x5555555c90e0 <send_message>, split_message = 0x5555555c8950 <split_message>, 
  channel_find_func = 0x5555555c1740 <irc_channel_find_server>, query_find_func = 0x5555555c8490 <irc_query_find>, mask_match_func = 0x0, nick_match_msg = 0x0, redirects = 0x0, 
  redirect_queue = 0x0, redirect_next = 0x0, redirect_active = 0x0, last_nick = 0x555555d6bcc0 "+i", real_address = 0x0, usermode = 0x0, wanted_usermode = 0x0, userhost = 0x0, 
  channels_formed = 30, whois_found = 0, whowas_found = 0, emode_known = 1, no_multi_mode = 0, no_multi_who = 0, one_endofwho = 0, disable_lag = 0, nick_collision = 0, motd_got = 1, 
  isupport_sent = 1, cap_complete = 1, sasl_success = 0, max_kicks_in_cmd = 1, max_modes_in_cmd = 20, max_whois_in_cmd = 4, max_msgs_in_cmd = 20, cap_supported = 0x0, 
  cap_active = 0x0, cap_queue = 0x0, sasl_buffer = 0x0, sasl_timeout = 0, cmdcount = 57070, cmdqueue = 0x0, wait_cmd = {tv_sec = 0, tv_usec = 346171}, last_cmd = {
    tv_sec = 1512491613, tv_usec = 92250}, max_cmds_at_once = 5, cmd_queue_speed = 0, max_query_chans = 1, idles = 0x0, ctcpqueue = 0x0, knockoutlist = 0x0, splits = 0x0, 
  split_servers = 0x0, rejoin_channels = 0x0, chanqueries = 0x0, isupport = 0x0, modes = {{func = 0x0, prefix = 0 '\000'} <repeats 65 times>, {func = 0x5555555cee90 <modes_type_d>, 
      prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x5555555cee90 <modes_type_d>, prefix = 0 '\000'}, {func = 0x0, 
      prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x5555555ce3f0 <modes_type_a>, 
      prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x5555555cee90 <modes_type_d>, 
      prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x5555555cee90 <modes_type_d>, prefix = 0 '\000'}, {func = 0x0, 
      prefix = 0 '\000'}, {func = 0x5555555cee90 <modes_type_d>, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, 
      prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x5555555cebb0 <modes_type_prefix>, 
      prefix = 33 '!'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, 
      prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x5555555ce3f0 <modes_type_a>, 
      prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x5555555ce3f0 <modes_type_a>, prefix = 0 '\000'}, {func = 0x0, 
      prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x5555555cebb0 <modes_type_prefix>, prefix = 37 '%'}, {func = 0x5555555cee90 <modes_type_d>, prefix = 0 '\000'}, {
      func = 0x0, prefix = 0 '\000'}, {func = 0x5555555ce950 <modes_type_b>, prefix = 0 '\000'}, {func = 0x5555555ce8d0 <modes_type_c>, prefix = 0 '\000'}, {
      func = 0x5555555cee90 <modes_type_d>, prefix = 0 '\000'}, {func = 0x5555555cee90 <modes_type_d>, prefix = 0 '\000'}, {func = 0x5555555cebb0 <modes_type_prefix>, 
      prefix = 64 '@'}, {func = 0x5555555cee90 <modes_type_d>, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x5555555cee90 <modes_type_d>, prefix = 0 '\000'}, {
      func = 0x5555555cee90 <modes_type_d>, prefix = 0 '\000'}, {func = 0x5555555cee90 <modes_type_d>, prefix = 0 '\000'}, {func = 0x5555555cee90 <modes_type_d>, prefix = 0 '\000'}, 
    {func = 0x5555555cebb0 <modes_type_prefix>, prefix = 43 '+'}, {func = 0x5555555ce3f0 <modes_type_a>, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'}, {func = 0x0, 
      prefix = 0 '\000'}, {func = 0x5555555cee90 <modes_type_d>, prefix = 0 '\000'}, {func = 0x0, prefix = 0 '\000'} <repeats 133 times>}, 
  prefix = '\000' <repeats 33 times>, "Y\000\000\000h\000\000\000\000\000v", '\000' <repeats 20 times>, "o", '\000' <repeats 190 times>, 
  nick_comp_func = 0x5555555c7ec0 <irc_nickcmp_rfc1459>}

you can also try to show us a bt full

(gdb) bt full
#0  flood_newmsg (server=0x555555b96ae0, level=4, nick=0x5555559b6c81 "le_lookup: assertion 'hash_table != NULL' failed", 
    host=0x5555559b6c8c "assertion 'hash_table != NULL' failed", target=0x555556b0d0c0 "#test") at flood.c:196
        mserver = <optimized out>
        flood = <optimized out>
        rec = <optimized out>
        now = <optimized out>
        ttime = <optimized out>
        times = <optimized out>
        tnext = <optimized out>
        __func__ = "flood_newmsg"
#1  0x00005555555e053f in flood_privmsg (server=0x555555b96ae0, data=<optimized out>, nick=0x5555559b6c81 "le_lookup: assertion 'hash_table != NULL' failed", 
    addr=0x5555559b6c8c "assertion 'hash_table != NULL' failed") at flood.c:255
        params = 0x555556b0d0c0 "#test"
        target = 0x555556b0d0c0 "#test"
        text = 0x555556b0d0cc "no one will see this"
        level = 4
        nick = 0x5555559b6c81 "le_lookup: assertion 'hash_table != NULL' failed"
        server = 0x555555b96ae0
        data = <optimized out>
        addr = 0x5555559b6c8c "assertion 'hash_table != NULL' failed"
#2  0x00005555555fa0b6 in signal_emit_real (rec=rec@entry=0x555555870350, params=params@entry=4, va=va@entry=0x7fffffffe1f0, first_hook=<optimized out>) at signals.c:242
        arglist = {0x555555b96ae0, 0x555555d7c0ce, 0x5555559b6c81, 0x5555559b6c8c, 0x0, 0x0}
        prev_emitted_signal = 0x5555558743d0
        hook = 0x555555877000
        prev_emitted_hook = 0x55555586e470
        i = <optimized out>
        stopped = 0
        stop_emit_count = 0
        continue_emit_count = 0
        __func__ = "signal_emit_real"
#3  0x00005555555fa60d in signal_emit (signal=<optimized out>, params=params@entry=4) at signals.c:286
        rec = 0x555555870350
        va = {{gp_offset = 48, fp_offset = 48, overflow_arg_area = 0x7fffffffe2e0, reg_save_area = 0x7fffffffe210}}
        signal_id = <optimized out>
        __func__ = "signal_emit"
#4  0x00005555555c0858 in irc_server_event (server=0x555555b96ae0, line=0x5555559b6cb5 "[", nick=0x5555559b6c81 "le_lookup: assertion 'hash_table != NULL' failed", 
    address=0x5555559b6c8c "assertion 'hash_table != NULL' failed") at irc.c:308
        signal = <optimized out>
        event = 0x555555d7c0c0 "event privmsg"
        args = <optimized out>
        __func__ = "irc_server_event"
#5  0x00005555555fa0b6 in signal_emit_real (rec=rec@entry=0x5555558743d0, params=params@entry=4, va=va@entry=0x7fffffffe3b0, first_hook=<optimized out>) at signals.c:242
        arglist = {0x555555b96ae0, 0x5555559b6cb5, 0x5555559b6c81, 0x5555559b6c8c, 0x0, 0x0}
        prev_emitted_signal = 0x55555586ed00
        hook = 0x55555586e470
        prev_emitted_hook = 0x55555586ed30
        i = <optimized out>
        stopped = 0
        stop_emit_count = 0
        continue_emit_count = 0
        __func__ = "signal_emit_real"
#6  0x00005555555fa6e5 in signal_emit_id (signal_id=<optimized out>, params=4) at signals.c:304
        rec = 0x5555558743d0
        va = {{gp_offset = 48, fp_offset = 48, overflow_arg_area = 0x7fffffffe4a0, reg_save_area = 0x7fffffffe3d0}}
        __func__ = "signal_emit_id"
#7  0x00005555555fa0b6 in signal_emit_real (rec=rec@entry=0x55555586ed00, params=params@entry=2, va=va@entry=0x7fffffffe520, first_hook=<optimized out>) at signals.c:242
        arglist = {0x555555b96ae0, 0x5555559b6c80, 0x0, 0x0, 0x0, 0x0}
        prev_emitted_signal = 0x0
        hook = 0x55555586ed30
        prev_emitted_hook = 0x0
        i = <optimized out>
        stopped = 0
        stop_emit_count = 0
        continue_emit_count = 0
        __func__ = "signal_emit_real"
#8  0x00005555555fa6e5 in signal_emit_id (signal_id=<optimized out>, params=params@entry=2) at signals.c:304
        rec = 0x55555586ed00
        va = {{gp_offset = 32, fp_offset = 48, overflow_arg_area = 0x7fffffffe610, reg_save_area = 0x7fffffffe540}}
        __func__ = "signal_emit_id"
#9  0x00005555555c09b5 in irc_parse_incoming (server=0x555555b96ae0) at irc.c:383
        str = 0x5555559b6c80 "ble_lookup: assertion 'hash_table != NULL' failed"
        count = 0
        ret = 0
        server = 0x555555b96ae0
        count = 0
        ret = 0
#10 0x00005555555eb84d in irssi_io_invoke (source=<optimized out>, condition=<optimized out>, data=<optimized out>) at misc.c:55
        rec = <optimized out>
        icond = <optimized out>
#11 0x00007ffff6ba8dd5 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
No symbol table info available.
#12 0x00007ffff6ba91a0 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
No symbol table info available.
#13 0x00007ffff6ba922c in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
No symbol table info available.
#14 0x000055555557701f in main (argc=<optimized out>, argv=<optimized out>) at irssi.c:335
        version = 0
        options = {{long_name = 0x55555560ef4e "version", short_name = 118 'v', flags = 0, arg = G_OPTION_ARG_NONE, arg_data = 0x555555854a60 <version>, 
            description = 0x55555560c515 "Display irssi version", arg_description = 0x0}, {long_name = 0x0, short_name = 0 '\000', flags = 0, arg = G_OPTION_ARG_NONE, 
            arg_data = 0x0, description = 0x0, arg_description = 0x0}}
        loglev = 5

spaghetti2514 avatar Dec 05 '17 16:12 spaghetti2514

Oh, here's a new development.

I only crash when I send the text to a channel that has other people in it. What seems to be happening is that while the text is flooding into the channel and my client is busy sending it, some other client in the channel can't keep up and crashes, and the message from the server about their disconnect gets sent to me and my bible flood immediately stops. The next message to me then segfaults.

If I do this in an empty channel or with other users I know won't say anything or crash for the duration of the flood, I do not crash and the entire text is sent.

spaghetti2514 avatar Dec 05 '17 16:12 spaghetti2514

I can't really track it down even with all the information you provided. are you comfortable with compiling irssi? you could apply this patch https://github.com/ailin-nemui/irssi/commit/4e56675d2fea8aed59d91875f317ec228c94da85.patch and record a signal emission trace like so: irssi 2>signal_trace.log

ailin-nemui avatar Dec 07 '17 10:12 ailin-nemui

@ailin-nemui Sure, give me a few minutes

spaghetti2514 avatar Dec 07 '17 17:12 spaghetti2514

Okay so with the version of irssi I compiled with the patch, it never crashes. I have no idea why.

Also the signal_trace.log file is 645MB.

spaghetti2514 avatar Dec 07 '17 18:12 spaghetti2514

well.. thanks for trying.. meh. I wonder if you could run it under valgrind memory checker (without the signal trace patch); but maybe it's also be too slow to make it crash. or with some other methods @josephbisch do you have some suggestion how to best triage this issue?

ailin-nemui avatar Dec 08 '17 10:12 ailin-nemui

I second @ailin-nemui suggestion about trying to capture network traffic if possible with Wireshark to see what the server is sending to the crashing client.

Then we might be able to create a "fake" server that sends the same commands with netcat to avoid needing a real server and client(s) setup to reproduce the issue on our end (hopefully).

josephbisch avatar Dec 08 '17 11:12 josephbisch

f wireshark is too hard, can you try to /connect -rawlog file and if we're lucky it captures/writes enough before it crashes

ailin-nemui avatar Dec 08 '17 11:12 ailin-nemui

So irssi doesn't crash either with the signal tracing patch or when running under valgrind. I guess both make it too slow?

I don't think wireshark is feasible. The server I run irssi on has many users causing a lot of network traffic (including other irc clients). It's not so much traffic that I think it could have anything to do with the crashes, but it's enough that it would probably be a pain to extract anything from the log unless wireshark has some really good filtering tools that I'm not aware of.

can you try to /connect -rawlog file

Sure. What does that do exactly? I just tried using it and it didn't seem to create any file.

spaghetti2514 avatar Dec 12 '17 00:12 spaghetti2514

For the sake of reproduction, I'll mention again that this only happens if the ircd sends a message to your bible-sending client while it's in the middle of sending the bible. If you guys were trying to reproduce it in an empty channel, it probably won't work. Having a bot in the same channel that says something arbitrary every second or so should work, as should having a second client connected where you just manually send some messages while the first client is sending the bible.

You guys can probably figure out what's wrong a lot more easily than I can, if only it could be reproduced.

spaghetti2514 avatar Dec 12 '17 00:12 spaghetti2514

Are you sure that /connect -rawlog file doesn't create a file named file in the same directory as you run irssi from? It should record the raw IRC messages being sent.

I think that if we can get the rawlog from you, we can reproduce it without needing to try to get the same conditions with respect to bots or other clients that you have.

Ultimately the issue here is that irssi is crashing in response to something being received from the server, even if that message from the server is being sent to the crashing irssi as a result of a message from some other client. So we really need the messages between the server and the crashing client and then we should be able to reproduce the bug.

josephbisch avatar Dec 12 '17 00:12 josephbisch

I can't reproduce the bug, even with the instructions in your latest comment. I think it is more complex than just having a bot or second client sending messages at the same time as the bible is being sent.

josephbisch avatar Dec 12 '17 02:12 josephbisch

by the way @spaghetti2514 what exactly did you do when you say disable flood protection?

ailin-nemui avatar Dec 12 '17 02:12 ailin-nemui

what exactly did you do when you say disable flood protection?

I set cmd_queue_speed to 0

spaghetti2514 avatar Dec 12 '17 04:12 spaghetti2514

I got rawlog to work (I was giving it a path instead of just a filename because I wanted the file somewhere else). The results aren't illuminating. It's just a bunch of outgoing PRIVMSGs from me to the server containing bible passages followed by the single message that kills me, whether that's a message from a bot or whatever. There's nothing after it.

Example:

...
<< PRIVMSG #test :13:27 So the servants of the householder came and said unto him, Sir,
<< PRIVMSG #test :didst not thou sow good seed in thy field? from whence then hath it
>> :[email protected] PRIVMSG #test :aaa

spaghetti2514 avatar Dec 12 '17 04:12 spaghetti2514

Maybe you guys can't reproduce it because you can't fill up some kind of buffer fast enough? If disk IO is bottlenecking you there's a kernel module that you can compile and load to get a character device that you can read the bible straight out of.

spaghetti2514 avatar Dec 14 '17 02:12 spaghetti2514

I don't think disk IO should be a bottle neck here. You said on one config it didn't crash but only disconnect, did you have a chance to investigate the config difference?

ailin-nemui avatar Dec 14 '17 03:12 ailin-nemui

Okay, I finally figured out what the profile difference is. If nickcolor.pl is present in scripts/autorun, irssi segfaults, otherwise it doesn't.

spaghetti2514 avatar Dec 15 '17 23:12 spaghetti2514

Would it be possible to get a copy of your ~/.irssi directory (including nickcolor.pl) that you use to cause the crash please?

As well, could you point me to a copy of the bible.txt you're using?

Would it be possible for us to test on the IRC server you're using as well? If so, could you tell us the information to connect?

Maybe we can reproduce it if we copy everything exactly!

Thanks for your persistence in helping us figure it out!

horgh avatar Dec 16 '17 02:12 horgh

Here's a zip containing the minimal .irssi directory and the bible.txt

I'm testing this on irc.oppaiti.me, which is running inspircd 2.2.0. If you're gonna join be warned that the motd is the entire bee movie script.

spaghetti2514 avatar Dec 16 '17 04:12 spaghetti2514

Ah, yes, a christian server.

I managed to reproduce the crash. Thank you!

The short version (breakpoints on server_connect_init, server->connection_lost = TRUE, glog_func, server_disconnect), pretty much tells the whole story already. Bigger log with backtraces: log.txt

0x00007f095e233f30 in _start () from /lib64/ld-linux-x86-64.so.2
(rr) c
Continuing.

Breakpoint 9, server_connect_init (server=0x558ecd4099a0) at servers.c:357
357             g_return_if_fail(server != NULL);
(rr)
Continuing.
[New Thread 29950.29980]

Thread 1 hit Breakpoint 12, glog_func (log_domain=0x0, log_level=G_LOG_LEVEL_WARNING, message=0x558ecd88a9d0
    "Dropping some data on an outgoing connection") at fe-common-core.c:253
253             switch (log_level) {
(rr)
Continuing.

Thread 1 hit Breakpoint 11, irc_server_send_data (server=0x558ecd4099a0, data=<optimized out>, len=94) at irc-servers.c:529
529                     server->connection_lost = TRUE;
(rr)
Continuing.

Thread 1 hit Breakpoint 4, server_disconnect (server=0x558ecd4099a0) at servers.c:462
462     {
(rr)
Continuing.

Thread 1 hit Breakpoint 12, glog_func (log_domain=0x7f095d55efc5 "GLib", log_level=G_LOG_LEVEL_CRITICAL, message=0x558ecd5876b0
    "g_hash_table_lookup: assertion 'hash_table != NULL' failed") at fe-common-core.c:253
253             switch (log_level) {
(rr)
Continuing.

Thread 1 hit Breakpoint 12, glog_func (log_domain=0x7f095d55efc5 "GLib", log_level=G_LOG_LEVEL_CRITICAL, message=0x558ecd9f0200
    "g_hash_table_lookup: assertion 'hash_table != NULL' failed") at fe-common-core.c:253
253             switch (log_level) {
(rr)
Continuing.

[...]

Thread 1 received signal SIGSEGV, Segmentation fault.
flood_newmsg (server=0x558ecd4099a0, level=4, nick=0x558ecd40acd1 "|\237U",
    host=0x558ecd40acd9 "ble_lookup: assertion 'hash_table != NULL' failed",
    target=0x558ecd594720 "#christianity") at flood.c:196
196             flood = g_hash_table_lookup(mserver->floodlist, nick);
(rr)

The server->connection_lost = TRUE happens a few stack frames above the same path as the Dropping some data on an outgoing connection

The incoming message sent from the second irssi is handled by nickcolor.pl's sig_public which does $server->command(...), which calls parse_command which notices server->connection_lost and disconnects.

server_disconnect doesn't kill it yet, but it emits the "server disconnected" signal that clears stuff like server->isupport which results in those NULL hash table messages and calls flood_deinit_server which removes the module data of that flood.

So when it gets to flood_newmsg it does mserver = MODULE_DATA(server);, fails to find it and so it returns NULL, and mserver->floodlist crashes.

I think there's no use after free involved here, which is pretty cool. By the time it crashes the refcount of the server is 1, everything is good there. It's just a handful of things getting called after the irc part of the server is full of NULLs.

dequis avatar Dec 16 '17 06:12 dequis