atop Segmentation fault

atop exits with a SIGSEGV after running for a while (anywhere from tens of minutes to hours). I switch between the s, d and n views regularly.

Versions:
atop built from repo (fa4db43)
netatop 2.0 built from source tgz
Ubuntu kernel 4.18.0-15-generic

Stacktrace:

#0  _int_malloc (av=av@entry=0x7ffff71bfc40 <main_arena>, bytes=bytes@entry=0) at malloc.c:4028
#1  0x00007ffff6e6b0fc in __GI___libc_malloc (bytes=0) at malloc.c:3057
#2  0x000055555559087e in netatop_exitstore () at netatopif.c:350
#3  0x000055555555fe5c in engine () at atop.c:945
#4  0x000055555555f8e6 in main (argc=1, argv=0x7fffffffebb8) at atop.c:704

Feb 27 '19 06:02 GertBurger

I believe my commit b54801ca81c06b073126f7f941db1d922a2f1e1c probably fixed this issue. Please try building again with this commit and see if that resolves it.

Jun 28 '19 20:06 gleventhal

I tested that commit and b55f28a; the issue remains, although it took much longer to trigger. I am not sure whether that is relevant, though.

Jul 04 '19 08:07 GertBurger

You are probably hitting a different condition. I believe there are other unchecked readdir() and lseek() calls all over the place. Do you have the stack trace?
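
For illustration only, here is a minimal sketch of what checking these calls could look like. The helper names checked_seek and count_dir_entries are hypothetical and are not taken from the atop source. The sketch only shows the general pattern: test lseek() against (off_t)-1, and reset errno before readdir() so that a NULL return can be told apart from a real error.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from atop): seek to an absolute offset and
 * report a failure instead of silently ignoring it.
 * Returns 0 on success, -1 on error (errno is set by lseek()).
 */
static int checked_seek(int fd, off_t offset)
{
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {
        perror("lseek");
        return -1;
    }
    return 0;
}

/*
 * Hypothetical helper (not from atop): count the entries in a directory.
 * readdir() returns NULL both at end-of-directory and on error, so errno
 * is cleared before each call and inspected afterwards.
 * Returns the number of entries, or -1 on error.
 */
static long count_dir_entries(const char *path)
{
    DIR *dirp = opendir(path);
    if (dirp == NULL)
        return -1;

    long count = 0;
    int saved_errno;

    for (;;) {
        errno = 0;
        struct dirent *entry = readdir(dirp);
        if (entry == NULL)
            break;          /* end of directory or error, decided below */
        count++;
    }

    saved_errno = errno;    /* closedir() may overwrite errno */
    closedir(dirp);

    if (saved_errno != 0) {
        errno = saved_errno;
        return -1;
    }
    return count;
}
```

A caller would treat a -1 from either helper as a reason to skip the sample rather than continue with stale or uninitialised data.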

Jul 04 '19 14:07 gleventhal

Or even better, a coredump?

Jul 04 '19 14:07 gleventhal

Does it happen with netatop disabled?

Jul 04 '19 14:07 gleventhal

The stacktrace is the same:

(gdb) bt
#0  _int_malloc (av=av@entry=0x7ffff71bfc40 <main_arena>, bytes=bytes@entry=0) at malloc.c:4028
#1  0x00007ffff6e6b0fc in __GI___libc_malloc (bytes=0) at malloc.c:3057
#2  0x00005555555908a2 in netatop_exitstore () at netatopif.c:350
#3  0x000055555555febc in engine () at atop.c:945
#4  0x000055555555f946 in main (argc=1, argv=0x7fffffffebb8) at atop.c:704

I would rather not provide a core dump as it will likely contain sensitive information. What are you looking to extract from it?

This only happens when netatop is in use, i.e. the daemon is running and I am using the network view.

Jul 05 '19 09:07 GertBurger