
mrouted segfault after several hours

Open Lnx4F opened this issue 3 years ago • 3 comments

Hello! I currently have a setup using mrouted 4.4 on 2 interfaces on a brand new, CLI-only installation of Debian 11. I downloaded the 4.4 package from deb.troglobit.com.

No modifications to mrouted.conf, just launching the daemon and letting it do its thing for the 2 interfaces.

Please excuse my lack of knowledge, as I am neither a Linux pro nor a developer. :(

My 2 interfaces are:

  1. 10.11.12.0/24
  2. 192.168.1.0/24

I can see that the daemon crashes, and I pulled the syslog from that moment:

May 24 18:16:44 mrtr01 mrouted[625]: warning - Failed MRT_DEL_MFC(169.254.153.104 239.255.255.250): No such file or directory
May 24 18:16:44 mrtr01 mrouted[625]: warning - age_table_entry() trying to delete no-route (169.254.153.104 239.255.255.250): No such file or directory
May 25 05:03:56 mrtr01 kernel: [39776.168675] mrouted[625]: segfault at 21 ip 000055cd86b1fc61 sp 00007fff879362a0 error 6 in mrouted[55cd86b0e000+17000]
May 25 05:03:56 mrtr01 kernel: [39776.168727] Code: 30 85 ff 7f 54 8b 7b 38 85 ff 7f 3d 8b 73 10 89 ef e8 23 8e ff ff 48 8b 03 48 8b 53 08 48 85 c0 74 4f 48 89 50 08 48 8b 53 08 <48> 89 02 48 89 df e8 d4 e3>
May 25 05:03:56 mrtr01 systemd[1]: mrouted.service: Main process exited, code=killed, status=11/SEGV
May 25 05:03:56 mrtr01 systemd[1]: mrouted.service: Failed with result 'signal'.
May 25 05:03:56 mrtr01 systemd[1]: mrouted.service: Consumed 4.906s CPU time.

Restarting the daemon lets it run for another few hours, but then it crashes again. I previously had mrouted running on an Ubuntu VM, and the same issue happened there. I strongly suspect something strange on MY network is affecting the daemon. Perhaps the strange APIPA route delete requests are doing something to it?

My linux knowledge is very minimal, but if I'm given the steps to do something I will try and get it done for you. I am willing to debug or do whatever is needed. My goal is simply to have a stable daemon that I do not need to restart every so often. Appreciate any help that can be given!

Lnx4F avatar May 25 '22 15:05 Lnx4F

Hmm, yeah, that's not right. It obviously shouldn't segfault, regardless of your network setup.

The failure to delete the APIPA route could be related, but there are quite a few hours in between that log entry and the segfault.
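
As a side note, the kernel line in the log already gives a rough hint about where the crash happened: subtracting the start of the mapping shown in brackets (55cd86b0e000) from the instruction pointer (ip) yields an offset into the mrouted mapping. A minimal sketch of that arithmetic follows; fault_offset is a hypothetical helper name used only for illustration, not something that ships with mrouted.

```shell
#!/bin/sh
# fault_offset IP BASE
# Hypothetical helper: prints IP - BASE in hex. Both arguments are hex
# strings without a 0x prefix, as they appear in the kernel segfault line
# "segfault at ... ip <IP> ... in mrouted[<BASE>+<SIZE>]".
fault_offset() {
    printf '0x%x\n' "$(( 0x$1 - 0x$2 ))"
}

# Values copied from the syslog excerpt in this issue.
fault_offset 000055cd86b1fc61 55cd86b0e000
```

For this log the offset is 0x11c61, which lies inside the 0x17000 byte mapping. With a binary that has debug symbols, a tool such as addr2line -f -e on the mrouted binary may be able to turn such an offset into a function name, but the offset is relative to the mapping and may need adjusting by the mapping's file offset, so a backtrace from the core dump is the more reliable route.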

My recommendation is to rebuild with GDB debug flags and then either start mrouted manually or install it with make install instead of make install-strip, so we retain the debug symbols. When it eventually crashes, you use coredumpctl to 1) list, and then 2) debug the crash. Like this:

make clean
./configure CFLAGS="-g -Og"  your-other-configure-flags-here-like-prefix-etc
make -j9
sudo make install

Start it ... wait for crash

coredumpctl debug
> bt full

The bt full command (in GDB, which you need to have installed) shows a detailed backtrace that you can attach to this issue. Thanks!
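
If an interactive GDB session is awkward, the same backtrace can probably be captured non-interactively. The following is only a sketch under some assumptions: on Debian, coredumpctl comes from the systemd-coredump package, which has to be installed before the crash happens; the binary path /usr/local/sbin/mrouted is a guess based on a default make install prefix; and the helper names missing_tools and collect_backtrace, as well as the output file names, are made up for this example.

```shell
#!/bin/sh
# ASSUMPTION: path of the rebuilt binary; override via MROUTED_BIN if your
# configure prefix differs.
MROUTED_BIN=${MROUTED_BIN:-/usr/local/sbin/mrouted}

# Hypothetical helper: prints every named tool that is NOT found in PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical helper: lists recorded mrouted crashes, extracts a stored
# core dump, and writes a full backtrace to mrouted-backtrace.txt.
collect_backtrace() {
    missing=$(missing_tools coredumpctl gdb)
    if [ -n "$missing" ]; then
        echo "missing tools:" $missing >&2
        return 1
    fi
    coredumpctl list mrouted || return 1
    coredumpctl dump mrouted --output=core.mrouted || return 1
    gdb -batch -ex 'bt full' "$MROUTED_BIN" core.mrouted \
        > mrouted-backtrace.txt 2>&1
}
```

After a crash, running collect_backtrace (possibly with sudo, depending on who owns the core dump) should leave a mrouted-backtrace.txt that can be attached to this issue.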

troglobit avatar May 29 '22 15:05 troglobit

I've set up a long-term test in my home network to see if I can reproduce, but it would be real helpful if you could get the backtrace (bt) from your setup.

troglobit avatar Jun 19 '22 15:06 troglobit

Hi again, unfortunately I've not been able to reproduce your crash in my (limited) setup at home. I'll let this issue remain open for other ppl to chime in as well.

troglobit avatar Jul 15 '22 05:07 troglobit