audit-userspace

auditd segfaults on mvebu (Marvell Armada) on OpenWrt

Open M95D opened this issue 1 year ago • 11 comments

Hi.

I build and use my own OpenWrt images with audit enabled for 3 platforms:

  • Raspberry Pi 3B+
  • Linksys WRT-1900AC v1 (Marvell Armada XP)
  • Turris Omnia (Marvell Armada 385)

The 3 configurations that I built are almost identical. The drivers built into the kernel differ, but the OpenWrt config is mostly the same for all 3.

On RPi, audit works properly. On mvebu (both of them), kernel audit works, but the auditd daemon segfaults on start.

I tried:

  • to make the package built-in, so there wouldn't be any conflicts between kernel, libraries and auditd - segfaults too.
  • to install auditd from the official OpenWrt package - segfaults too.
  • to start without any audit rules or config files - segfaults too.
  • to start manually and with various cmdline switches - still segfaults.

I have had this problem for more than a year. At first I thought there must be something wrong with my build. But after building several images, a kernel major version change, numerous cleanups, and a host gcc major version change (the build host is Gentoo), I'm now sure it must be a problem with auditd itself, or possibly with the way OpenWrt builds it on mvebu platforms.

What more can I do to help solve this problem? Thanks.

PS: Sorry for the edit. I hit enter by mistake before the text was written.

M95D avatar Mar 29 '24 21:03 M95D

Ah... The version is 4y old.

M95D avatar Mar 29 '24 21:03 M95D

Do you have any debug information like a stack trace that shows where it is segfaulting? Do the logs mention where it is segfaulting? Does auditd itself mention something is not right? If you run it in gdb as root, does it segfault? If you run it with valgrind, does it segfault? Any stack trace would be helpful. While it may be a 4 year old release, it is worth digging into to see if the code has been fixed or not. (BTW, I have seen several segfaults on fedora that were in fact glibc's bug.) And...what version are you using?

stevegrubb avatar Mar 30 '24 14:03 stevegrubb

Thanks for reply.

The package is old. It doesn't make sense to debug that version. I'm currently updating it to v3.0.7 and I'll report back if it still doesn't work.

You are probably going to ask why I don't update to the latest version: it's because the build has failed since v3.0.8. I'll open another bug report for that after I'm finished with this version.

M95D avatar Mar 30 '24 22:03 M95D

I tried v3.0.7 and it still segfaults.

Normal run:

# auditd -f -c /tmp
Config file /tmp/auditd.conf doesn't exist, skipping
No plugins found, not dispatching events
Segmentation fault

strace:

strace -f auditd -f -c /tmp
execve("/usr/sbin/auditd", ["auditd", "-f", "-c", "/tmp"], 0xbed2fd50 /* 18 vars */) = 0
set_tls(0xb6f5f4dc)                     = 0
set_tid_address(0xb6f5e7d4)             = 8237
brk(NULL)                               = 0x42c000
brk(0x42e000)                           = 0x42e000
mmap2(0x42c000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x42c000
open("/etc/ld-musl-armhf.path", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib/libaudit.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/local/lib/libaudit.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib/libaudit.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0755, stx_size=132308, ...}) = 0
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0`?\1\0004\0\0\0"..., 936) = 936
mmap2(NULL, 212992, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb6eae000
mmap2(0xb6ec1000, 61440, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0x12000) = 0xb6ec1000
mmap2(0xb6ed0000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x20000) = 0xb6ed0000
mmap2(0xb6ed1000, 69632, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x20000) = 0xb6ed1000
mmap2(0xb6ed2000, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb6ed2000
close(3)                                = 0
open("/lib/libauparse.so.0", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/local/lib/libauparse.so.0", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib/libauparse.so.0", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0755, stx_size=169884, ...}) = 0
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0 \276\0\0004\0\0\0"..., 936) = 936
mmap2(NULL, 204800, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb6e7c000
mmap2(0xb6e87000, 131072, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0xa000) = 0xb6e87000
mmap2(0xb6ea7000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x29000) = 0xb6ea7000
mmap2(0xb6ea8000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x29000) = 0xb6ea8000
mmap2(0xb6ea9000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb6ea9000
close(3)                                = 0
open("/lib/libgcc_s.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=45060, ...}) = 0
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\0\0\0\0004\0\0\0"..., 936) = 936
mmap2(NULL, 49152, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xb6e70000
mmap2(0xb6e7a000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0xa000) = 0xb6e7a000
close(3)                                = 0
mprotect(0xb6ed0000, 4096, PROT_READ)   = 0
mprotect(0xb6ea7000, 4096, PROT_READ)   = 0
mprotect(0xb6e7a000, 4096, PROT_READ)   = 0
mprotect(0x427000, 8192, PROT_READ)     = 0
open("/proc/self/sessionid", O_RDONLY|O_LARGEFILE|O_NOFOLLOW) = 3
read(3, "4294967295", 16)               = 10
close(3)                                = 0
geteuid32()                             = 0
rt_sigaction(SIGHUP, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGINT, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGQUIT, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGILL, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGTRAP, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[], [], 8)   = 0
rt_sigaction(SIGABRT, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigaction(SIGBUS, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGFPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGKILL, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = -1 EINVAL (Invalid argument)
rt_sigaction(SIGUSR1, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGSEGV, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGUSR2, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGALRM, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGSTKFLT, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGCHLD, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGCONT, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGSTOP, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = -1 EINVAL (Invalid argument)
rt_sigaction(SIGTSTP, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGTTIN, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGTTOU, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGURG, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGXCPU, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGXFSZ, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGVTALRM, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGPROF, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGWINCH, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGIO, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGPWR, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGSYS, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_3, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_4, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_5, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_6, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_7, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_8, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_9, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_10, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_11, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_12, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_13, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_14, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_15, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_16, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_17, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_18, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_19, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_20, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_21, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_22, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_23, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_24, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_25, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_26, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_27, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_28, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_29, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_30, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_31, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigaction(SIGRT_32, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RT_1 RT_2], NULL, 8) = 0
rt_sigaction(SIGCHLD, {sa_handler=0x408aac, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xb6f2c084}, NULL, 8) = 0
prlimit64(0, RLIMIT_FSIZE, {rlim_cur=RLIM64_INFINITY, rlim_max=RLIM64_INFINITY}, NULL) = 0
prlimit64(0, RLIMIT_CPU, {rlim_cur=RLIM64_INFINITY, rlim_max=RLIM64_INFINITY}, NULL) = 0
open("/tmp/auditd.conf", O_RDONLY|O_LARGEFILE|O_NOFOLLOW) = -1 ENOENT (No such file or directory)
writev(2, [{iov_base="Config file /tmp/auditd.conf doe"..., iov_len=52}, {iov_base=NULL, iov_len=0}], 2Config file /tmp/auditd.conf doesn't exist, skipping) = 52
writev(2, [{iov_base="", iov_len=0}, {iov_base="\n", iov_len=1}], 2
) = 1
getpriority(PRIO_PROCESS, 0)            = 20
setpriority(PRIO_PROCESS, 0, -4)        = 0
socket(AF_NETLINK, SOCK_RAW, NETLINK_AUDIT) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
getpid()                                = 8237
open("/var/run/auditd.pid", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE|O_NOFOLLOW, 0644) = 4
write(4, "8237\n", 5)                   = 5
close(4)                                = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6e6f000
fcntl64(1, F_GETFL)                     = 0x20402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
ioctl(1, TIOCGWINSZ, {ws_row=79, ws_col=316, ws_xpixel=1915, ws_ypixel=1031}) = 0
mmap2(NULL, 28672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6e68000
rt_sigprocmask(SIG_UNBLOCK, [RT_1 RT_2], NULL, 8) = 0
membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0) = 0
mmap2(NULL, 143360, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6e45000
mprotect(0xb6e47000, 135168, PROT_READ|PROT_WRITE) = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1 RT_2], [], 8) = 0
clone(child_stack=0xb6e67d70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|0x400000strace: Process 8238 attached
, parent_tid=[8238], tls=0xb6e67df4, child_tidptr=0xb6f5e7d4) = 8238
[pid  8237] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid  8238] rt_sigprocmask(SIG_SETMASK, [],  <unfinished ...>
[pid  8237] clock_gettime64(CLOCK_MONOTONIC, {tv_sec=7225, tv_nsec=291333852}) = 0
[pid  8238] <... rt_sigprocmask resumed>NULL, 8) = 0
[pid  8237] clock_gettime64(CLOCK_REALTIME, {tv_sec=1711909269, tv_nsec=577874186}) = 0
[pid  8238] rt_sigprocmask(SIG_SETMASK, [HUP USR1 USR2 TERM CHLD CONT],  <unfinished ...>
[pid  8237] clock_gettime64(CLOCK_MONOTONIC, {tv_sec=7225, tv_nsec=292086101}) = 0
[pid  8238] <... rt_sigprocmask resumed>NULL, 8) = 0
[pid  8237] open("/etc/audit/plugins.d", O_RDONLY|O_LARGEFILE|O_CLOEXEC|O_DIRECTORY <unfinished ...>
[pid  8238] futex(0xb6e67c94, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  8237] <... open resumed>)         = 4
[pid  8237] fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
[pid  8237] mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6e43000
[pid  8237] getdents64(4, 0xb6e43058 /* 5 entries */, 2048) = 152
[pid  8237] open("/etc/audit/plugins.d/af_unix.conf", O_RDONLY|O_LARGEFILE) = 5
[pid  8237] statx(5, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0640, stx_size=358, ...}) = 0
[pid  8237] read(5, "\n# This file controls the config"..., 1024) = 358
[pid  8237] read(5, "", 1024)           = 0
[pid  8237] close(5)                    = 0
[pid  8237] open("/etc/audit/plugins.d/au-remote.conf", O_RDONLY|O_LARGEFILE) = 5
[pid  8237] statx(5, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0640, stx_size=238, ...}) = 0
[pid  8237] read(5, "\n# This file controls the audisp"..., 1024) = 238
[pid  8237] read(5, "", 1024)           = 0
[pid  8237] close(5)                    = 0
[pid  8237] open("/etc/audit/plugins.d/syslog.conf", O_RDONLY|O_LARGEFILE) = 5
[pid  8237] statx(5, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0640, stx_size=521, ...}) = 0
[pid  8237] read(5, "# This file controls the configu"..., 1024) = 521
[pid  8237] read(5, "", 1024)           = 0
[pid  8237] close(5)                    = 0
[pid  8237] getdents64(4, 0xb6e43058 /* 0 entries */, 2048) = 0
[pid  8237] close(4)                    = 0
[pid  8237] munmap(0xb6e43000, 8192)    = 0
[pid  8237] writev(2, [{iov_base="No plugins found, not dispatchin"..., iov_len=40}, {iov_base=NULL, iov_len=0}], 2No plugins found, not dispatching events) = 40
[pid  8237] writev(2, [{iov_base="", iov_len=0}, {iov_base="\n", iov_len=1}], 2
) = 1
[pid  8237] socketpair(AF_UNIX, SOCK_STREAM, 0, [4, 5]) = 0
[pid  8237] fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
[pid  8237] fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
[pid  8237] uname({sysname="Linux", nodename="GRAPHRT", ...}) = 0
[pid  8237] getpid()                    = 8237
[pid  8237] open("/proc/8237/attr/current", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
[pid  8237] open("/proc/self/loginuid", O_RDONLY|O_LARGEFILE|O_NOFOLLOW) = 6
[pid  8237] read(6, "4294967295", 16)   = 10
[pid  8237] close(6)                    = 0
[pid  8237] getpid()                    = 8237
[pid  8237] getuid32()                  = 0
[pid  8237] clock_gettime64(CLOCK_REALTIME, {tv_sec=1711909269, tv_nsec=587973438}) = 0
[pid  8237] clock_gettime64(CLOCK_REALTIME, {tv_sec=1711909269, tv_nsec=588203591}) = 0
[pid  8237] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x24c} ---
[pid  8238] <... futex resumed>)        = ?
[pid  8238] +++ killed by SIGSEGV +++
+++ killed by SIGSEGV +++
Segmentation fault

I built the kernel with Verbose user fault messages (CONFIG_USER_DEBUG) and it showed this:

<3>[  142.662503][ T5739] 8<--- cut here ---
<3>[  142.666354][ T5739] auditd: unhandled page fault (11) at 0x00000019, code 0x005
<3>[  142.673722][ T5739] [00000019] *pgd=00000000
<5>[  142.678026][ T5739] CPU: 0 PID: 5739 Comm: auditd Tainted: G           O       6.1.82 #0
<5>[  142.686167][ T5739] Hardware name: Marvell Armada 380/385 (Device Tree)
<5>[  142.692837][ T5739] PC is at 0xb6f4f5b0
<5>[  142.696694][ T5739] LR is at 0xb6f50c44
<5>[  142.700540][ T5739] pc : [<b6f4f5b0>]    lr : [<b6f50c44>]    psr: 20000010
<5>[  142.707542][ T5739] sp : bed11130  ip : 0000003a  fp : 00000000
<5>[  142.713482][ T5739] r10: bed11540  r9 : 004ab3c5  r8 : 00000000
<5>[  142.719415][ T5739] r7 : 00000000  r6 : 00000019  r5 : 00000019  r4 : 7fffffff
<5>[  142.726740][ T5739] r3 : 00000019  r2 : 7fffffff  r1 : 00000000  r0 : 00000019
<5>[  142.733992][ T5739] Flags: nzCv  IRQs on  FIQs on  Mode USER_32  ISA ARM Segment user
<5>[  142.741934][ T5739] Control: 10c5387d  Table: 0a4ac04a  DAC: 00000055
<3>[  147.749776][ T5766] 8<--- cut here ---
<3>[  147.753546][ T5766] auditd: unhandled page fault (11) at 0x00000074, code 0x005
<3>[  147.760895][ T5766] [00000074] *pgd=00000000
<5>[  147.765180][ T5766] CPU: 0 PID: 5766 Comm: auditd Tainted: G           O       6.1.82 #0
<5>[  147.773300][ T5766] Hardware name: Marvell Armada 380/385 (Device Tree)
<5>[  147.779955][ T5766] PC is at 0xb6efa5c0
<5>[  147.783802][ T5766] LR is at 0xb6efbc44
<5>[  147.787662][ T5766] pc : [<b6efa5c0>]    lr : [<b6efbc44>]    psr: 20000010
<5>[  147.794647][ T5766] sp : bec20120  ip : 0000003a  fp : 00000000
<5>[  147.800592][ T5766] r10: bec20540  r9 : 0046b3c5  r8 : 00000000
<5>[  147.806526][ T5766] r7 : 00000000  r6 : 00000074  r5 : 00000074  r4 : 7fffffff
<5>[  147.813771][ T5766] r3 : 00000032  r2 : 7fffffff  r1 : 00000000  r0 : 00000074
<5>[  147.821029][ T5766] Flags: nzCv  IRQs on  FIQs on  Mode USER_32  ISA ARM Segment user
<5>[  147.828966][ T5766] Control: 10c5387d  Table: 0a48c04a  DAC: 00000055

I haven't used gdb before and have no experience with it. Following a quick tutorial I found, I ran # gdb --args auditd -n -c /tmp and pressed "r":

(gdb) r
Starting program: /usr/sbin/auditd -f -c /tmp
Config file /tmp/auditd.conf doesn't exist, skipping
[New LWP 8509]
No plugins found, not dispatching events

Thread 1 "auditd" received signal SIGSEGV, Segmentation fault.
0xb6fd75b0 in ?? ()

I want to help debug this, but I have never done it before. I need specific instructions.

M95D avatar Mar 31 '24 18:03 M95D

When it segfaults, type "bt" for a backtrace. It works better if auditd is not stripped.

stevegrubb avatar Mar 31 '24 21:03 stevegrubb

(gdb) r
Starting program: /usr/sbin/auditd -f -c /tmp
Config file /tmp/auditd.conf doesn't exist, skipping
[New LWP 16076]
No plugins found, not dispatching events

Thread 1 "auditd" received signal SIGSEGV, Segmentation fault.
0xb6fd75b0 in ?? ()
(gdb) bt
#0  0xb6fd75b0 in ?? ()
#1  0xb6fd8c44 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

I'll try to build it without stripping. It might take a while.

M95D avatar Mar 31 '24 22:03 M95D

Yes, this looks stripped since it has double question marks. The strace gives hints about where it is. ltrace would probably get closer since it works at the function level of the libraries rather than at the syscall interface. Valgrind also does a decent job of making a backtrace.

stevegrubb avatar Apr 01 '24 03:04 stevegrubb

I built it without stripping, but gdb still doesn't show anything different:

root@GRAPHRT:/mnt/Work/Share/OpenWrt# file /usr/sbin/auditd
/usr/sbin/auditd: ELF 32-bit LSB pie executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-musl-armhf.so.1, with debug_info, not stripped
root@GRAPHRT:/mnt/Work/Share/OpenWrt# gdb --args auditd -f -c /tmp
GNU gdb (GDB) 14.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "arm-openwrt-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from auditd...
(gdb) r
Starting program: /usr/sbin/auditd -f -c /tmp
Config file /tmp/auditd.conf doesn't exist, skipping
[New LWP 3223]
No plugins found, not dispatching events

Thread 1 "auditd" received signal SIGSEGV, Segmentation fault.
0xb6fd75c0 in ?? ()
(gdb) bt
#0  0xb6fd75c0 in ?? ()
#1  0xb6fd8c44 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

Valgrind:

# valgrind /usr/sbin/auditd -f -c /tmp
==3541== Memcheck, a memory error detector
==3541== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==3541== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==3541== Command: /usr/sbin/auditd -f -c /tmp
==3541==
==3541== Warning: ignored attempt to set SIGKILL handler in sigaction();
==3541==          the SIGKILL signal is uncatchable
==3541== Warning: ignored attempt to set SIGSTOP handler in sigaction();
==3541==          the SIGSTOP signal is uncatchable
Config file /tmp/auditd.conf doesn't exist, skipping
No plugins found, not dispatching events
==3541== Invalid read of size 1
==3541==    at 0x40555C0: ??? (in /lib/libc.so)
==3541==  Address 0x84 is not stack'd, malloc'd or (recently) free'd
==3541==
==3541==
==3541== Process terminating with default action of signal 11 (SIGSEGV)
==3541==  Access not within mapped region at address 0x84
==3541==    at 0x40555C0: ??? (in /lib/libc.so)
==3541==  If you believe this happened as a result of a stack
==3541==  overflow in your program's main thread (unlikely but
==3541==  possible), you can try to increase the size of the
==3541==  main thread stack using the --main-stacksize= flag.
==3541==  The main thread stack size used in this run was 8388608.
==3541==
==3541== HEAP SUMMARY:
==3541==     in use at exit: 0 bytes in 0 blocks
==3541==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==3541==
==3541== All heap blocks were freed -- no leaks are possible
==3541==
==3541== For lists of detected and suppressed errors, rerun with: -s
==3541== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault

Now I guess I have to build the entire OpenWrt without stripping...

PS: 8388608 - isn't that the 8 MB stack limit on Linux?

M95D avatar Apr 01 '24 09:04 M95D
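On the stack-size question: 8388608 bytes is 8192 KiB, which is 8 MiB. A minimal C sketch of that arithmetic, together with a way to read the current soft limit, is shown below. The helper names kib_to_bytes and stack_limit_bytes are made up for this illustration and are not part of auditd or Valgrind.

```c
#include <sys/resource.h>

/* Convert a size given in KiB (units of 1024 bytes) to bytes. */
unsigned long long kib_to_bytes(unsigned long long kib)
{
    return kib * 1024ULL;
}

/* Read the soft stack limit of the calling process.
 * Returns 0 and stores the limit in *out on success,
 * 1 if no limit is set, and -1 if getrlimit() fails. */
int stack_limit_bytes(unsigned long long *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    *out = (unsigned long long)rl.rlim_cur;
    return 0;
}
```

With these helpers, kib_to_bytes(8192) gives 8388608, the figure Valgrind printed for the main thread stack.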

Another run with a larger stack, while waiting for OpenWrt to compile:

root@GRAPHRT:/mnt/Work/Share/OpenWrt# ulimit -s
8192
root@GRAPHRT:/mnt/Work/Share/OpenWrt# ulimit -s 65536
root@GRAPHRT:/mnt/Work/Share/OpenWrt# ulimit -s
65536
root@GRAPHRT:/mnt/Work/Share/OpenWrt# valgrind /usr/sbin/auditd -f -c /tmp
==3643== Memcheck, a memory error detector
==3643== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==3643== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==3643== Command: /usr/sbin/auditd -f -c /tmp
==3643==
==3643== Warning: ignored attempt to set SIGKILL handler in sigaction();
==3643==          the SIGKILL signal is uncatchable
==3643== Warning: ignored attempt to set SIGSTOP handler in sigaction();
==3643==          the SIGSTOP signal is uncatchable
Config file /tmp/auditd.conf doesn't exist, skipping
No plugins found, not dispatching events
==3643== Invalid read of size 1
==3643==    at 0x40555B0: ??? (in /lib/libc.so)
==3643==  Address 0x197 is not stack'd, malloc'd or (recently) free'd
==3643==
==3643==
==3643== Process terminating with default action of signal 11 (SIGSEGV)
==3643==  Access not within mapped region at address 0x197
==3643==    at 0x40555B0: ??? (in /lib/libc.so)
==3643==  If you believe this happened as a result of a stack
==3643==  overflow in your program's main thread (unlikely but
==3643==  possible), you can try to increase the size of the
==3643==  main thread stack using the --main-stacksize= flag.
==3643==  The main thread stack size used in this run was 16777216.
==3643==
==3643== HEAP SUMMARY:
==3643==     in use at exit: 0 bytes in 0 blocks
==3643==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==3643==
==3643== All heap blocks were freed -- no leaks are possible
==3643==
==3643== For lists of detected and suppressed errors, rerun with: -s
==3643== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault

M95D avatar Apr 01 '24 09:04 M95D

In reviewing the strace, I think it is trying to send the DAEMON_START event. The call to getsubj failed with ENOENT, so it takes the other branch. After creating the event, it calls send_audit_event(). The last things in the trace are the calls to gettimeofday() and then the segfault. It's hard to tell how far it got after that. The next step is calling distribute_event(), which can do a lot before causing a syscall.

I don't know if ltrace would be helpful. But if it's not in audit's code, the likely suspect is the C library - which I guess is musl c? The invalid read of size 1 is also a clue to where it is. The address 0x84 sounds like something is a NULL ptr but part of a large data structure. If you have debug symbols for the C library, that might shed light on where it is.

stevegrubb avatar Apr 01 '24 12:04 stevegrubb
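The reasoning about a small fault address such as 0x84 can be illustrated with a short C sketch. The struct below is purely hypothetical (it is not an audit or musl type). It only shows that reading a member through a NULL base pointer touches an address equal to that member's offset. This is one way such an address can arise, not a statement about what happens in auditd.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only. */
struct big_record {
    char header[0x84];  /* 132 bytes of leading fields */
    char name[16];      /* the member the faulting code would read */
};

/* Address that a read of rec->name[0] would touch, computed without
 * dereferencing anything. With rec == NULL this is the member offset. */
uintptr_t name_address(const struct big_record *rec)
{
    return (uintptr_t)rec + offsetof(struct big_record, name);
}
```

Calling name_address(NULL) yields 0x84 on typical platforms, an address in the unmapped first page, which the kernel would report as a fault at that low address.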

The C library is musl. I built an image without stripping any package and I hope that includes musl. I can't test it before the weekend. WFH - I can't risk breaking the router.

I don't have ltrace on the router. I would skip that step if possible.

M95D avatar Apr 01 '24 20:04 M95D

OK, apparently, /lib/libc.so can be replaced while the router is running. :-) I didn't expect that. Here's the trace:

# valgrind auditd -f -c /tmp
==12521== Memcheck, a memory error detector
==12521== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==12521== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==12521== Command: auditd -f -c /tmp
==12521== 
==12521== Warning: ignored attempt to set SIGKILL handler in sigaction();
==12521==          the SIGKILL signal is uncatchable
==12521== Warning: ignored attempt to set SIGSTOP handler in sigaction();
==12521==          the SIGSTOP signal is uncatchable
Config file /tmp/auditd.conf doesn't exist, skipping
No plugins found, not dispatching events
==12521== Invalid read of size 1
==12521==    at 0x406E9E0: memchr (memchr.c:16)
==12521==    by 0x4070AAF: strnlen (strnlen.c:5)
==12521==    by 0x401E27F: printf_core (vfprintf.c:599)
==12521==    by 0x40690FB: vfprintf (vfprintf.c:688)
==12521==    by 0x406BAE7: vsnprintf (vsnprintf.c:54)
==12521==    by 0x4068143: snprintf (snprintf.c:9)
==12521==    by 0x11101F: send_audit_event (auditd.c:314)
==12521==    by 0x112B27: main (auditd.c:842)
==12521==  Address 0x1c1 is not stack'd, malloc'd or (recently) free'd
==12521== 
==12521== 
==12521== Process terminating with default action of signal 11 (SIGSEGV)
==12521==  Access not within mapped region at address 0x1C1
==12521==    at 0x406E9E0: memchr (memchr.c:16)
==12521==    by 0x4070AAF: strnlen (strnlen.c:5)
==12521==    by 0x401E27F: printf_core (vfprintf.c:599)
==12521==    by 0x40690FB: vfprintf (vfprintf.c:688)
==12521==    by 0x406BAE7: vsnprintf (vsnprintf.c:54)
==12521==    by 0x4068143: snprintf (snprintf.c:9)
==12521==    by 0x11101F: send_audit_event (auditd.c:314)
==12521==    by 0x112B27: main (auditd.c:842)
==12521==  If you believe this happened as a result of a stack
==12521==  overflow in your program's main thread (unlikely but
==12521==  possible), you can try to increase the size of the
==12521==  main thread stack using the --main-stacksize= flag.
==12521==  The main thread stack size used in this run was 8388608.
==12521== 
==12521== HEAP SUMMARY:
==12521==     in use at exit: 19,490 bytes in 8 blocks
==12521==   total heap usage: 24 allocs, 16 frees, 25,285 bytes allocated
==12521== 
==12521== LEAK SUMMARY:
==12521==    definitely lost: 0 bytes in 0 blocks
==12521==    indirectly lost: 0 bytes in 0 blocks
==12521==      possibly lost: 0 bytes in 0 blocks
==12521==    still reachable: 19,490 bytes in 8 blocks
==12521==         suppressed: 0 bytes in 0 blocks
==12521== Rerun with --leak-check=full to see details of leaked memory
==12521== 
==12521== For lists of detected and suppressed errors, rerun with: -s
==12521== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault

M95D avatar Apr 02 '24 17:04 M95D

gdb:

Reading symbols from /usr/sbin/auditd...
(gdb) r
Starting program: /usr/sbin/auditd -f -c /tmp
Config file /tmp/auditd.conf doesn't exist, skipping
[New LWP 13134]
No plugins found, not dispatching events

Thread 1 "auditd" received signal SIGSEGV, Segmentation fault.
0xb6fcf9e0 in memchr (src=src@entry=0x31f, c=c@entry=0, n=n@entry=2147483647) at src/string/memchr.c:16
warning: Source file is more recent than executable.
16              for (; ((uintptr_t)s & ALIGN) && n && *s != c; s++, n--);
(gdb) bt
#0  0xb6fcf9e0 in memchr (src=src@entry=0x31f, c=c@entry=0, n=n@entry=2147483647) at src/string/memchr.c:16
#1  0xb6fd1ab0 in strnlen (s=s@entry=0x31f <error: Cannot access memory at address 0x31f>, n=2147483647) at src/string/strnlen.c:5
#2  0xb6f7f280 in printf_core (f=f@entry=0xbeffb3a0, fmt=fmt@entry=0x405bb0 "audit(%lu.%03u:%u): %s", ap=ap@entry=0xbeffb29c, nl_arg=nl_arg@entry=0xbeffb2c8, nl_type=<optimized out>, nl_type@entry=0xbeffb2a0) at src/stdio/vfprintf.c:599
#3  0xb6fca0fc in vfprintf (f=f@entry=0xbeffb3a0, fmt=fmt@entry=0x405bb0 "audit(%lu.%03u:%u): %s", ap=..., ap@entry=...) at src/stdio/vfprintf.c:688
#4  0xb6fccae8 in vsnprintf (s=<optimized out>, n=496, fmt=0x405bb0 "audit(%lu.%03u:%u): %s", fmt@entry=0x20 <error: Cannot access memory at address 0x20>, ap=..., ap@entry=...) at src/stdio/vsnprintf.c:54
#5  0xb6fc9144 in snprintf (s=<optimized out>, n=<optimized out>, fmt=0x405bb0 "audit(%lu.%03u:%u): %s") at src/stdio/snprintf.c:9
#6  0x00409020 in send_audit_event (type=1200, str=0xbeffd9a8 "op=start ver=3.0.7 format=enriched kernel=6.1.82 auid=4294967295 pid=13132 uid=0 ses=4294967295 res=success") at auditd.c:314
#7  0x0040ab28 in main (argc=4, argv=0xbefffd24) at auditd.c:842

memchr.c might not be the correct one. I took the most likely file from the build dir and put it on the router.

M95D avatar Apr 02 '24 18:04 M95D

Thanks. It looks like things are OK down to printf_core. The send_audit_event function variables all look OK. Some are optimized out. (I'd really like to have known what the value of s is for snprintf.) The code in question in printf_core looks like this:

        case 's':
                a = arg.p ? arg.p : "(null)";
                z = a + strnlen(a, p<0 ? INT_MAX : p);

So, it's processing the %s of the audit event format. But going back up to the snprintf frame to see what was passed shows nothing: you see the format string and then nothing else. I guess that is because everything is wrapped up by va_start. Back in printf_core, if the pointer is NULL, it switches to the string constant so that no NULL pointer is dereferenced. Instead the pointer is 0x31f, which is not NULL but is invalid. I'd say the real pointer's value got lost somewhere in that function.
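To make the %s handling above concrete, here is a minimal sketch of that branch as a standalone function. The name percent_s_length is hypothetical and not part of musl. It uses memchr directly because, per the backtrace, that is where strnlen ends up. It only illustrates that NULL is the single pointer value that gets special treatment.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of musl: returns how many bytes a "%s"
 * conversion would emit for `arg`. A negative precision means "no limit",
 * as in the printf_core snippet quoted above. */
size_t percent_s_length(const char *arg, int precision)
{
    /* NULL is replaced by a literal, so NULL itself never gets dereferenced. */
    const char *a = arg ? arg : "(null)";
    size_t limit = precision < 0 ? (size_t)INT_MAX : (size_t)precision;

    /* Any other value is trusted as-is. A bogus non-NULL address such as
     * 0x31f passes the check above and faults right here, in memchr. */
    const char *end = memchr(a, 0, limit);
    return end ? (size_t)(end - a) : limit;
}
```

Calling percent_s_length(NULL, -1) measures the "(null)" literal, while a garbage address would crash in memchr exactly as in the traces.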

printf_core is a very complicated function. I'm wondering if this might have been fixed in a later version?

stevegrubb avatar Apr 02 '24 18:04 stevegrubb

Sorry. I don't understand that. My C knowledge is very limited. Can I help find out what s is? Will that help?

I see that musl released v1.2.5 a month ago. OpenWrt uses v1.2.4. I'll try to update.

M95D avatar Apr 02 '24 20:04 M95D

The %s means insert a string here. In this case, it's the whole audit record text as variable str. You can see the text was fine going into send_audit_event. But as I mentioned, the snprintf frame doesn't show all its arguments for some reason. It's a one-line function that reformats things for the next function. But the same str is passed to snprintf untouched. It should still point to whatever it entered with.
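For readers less familiar with C, the "reformats things for the next function" step can be sketched roughly as below. The name wrapped_snprintf is hypothetical and this is not musl's source, only the usual shape of such a wrapper: the trailing arguments are packed into a va_list and handed on, which is why a debugger shows the format string but not the individual arguments.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for a thin snprintf-style wrapper. */
int wrapped_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int ret;

    va_start(ap, fmt);                   /* collect everything after fmt */
    ret = vsnprintf(buf, size, fmt, ap); /* forward the arguments untouched */
    va_end(ap);
    return ret;
}
```

The wrapper never inspects the arguments itself. Each one is only interpreted later, when the formatter walks the format string and pulls a value of the type each conversion names.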

stevegrubb avatar Apr 02 '24 21:04 stevegrubb

Another idea is maybe there's stack corruption somehow or undefined behavior. Might be worth adding the address and undefined behavior sanitizer flags to see if they dig up anything.
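One kind of undefined behavior worth ruling out is a mismatch between a conversion specifier and the argument actually passed. This is an assumption, not something confirmed in this thread. If an argument is wider than its specifier expects (say, a 64-bit seconds value consumed as %lu on a 32-bit target), the formatter can read every later argument, including the %s pointer, from the wrong slot. The sketch below uses a hypothetical helper, format_event, that casts each argument to exactly the type its specifier names.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not taken from auditd: builds an audit-style record
 * with every variadic argument cast to the type its specifier expects. */
int format_event(char *buf, size_t size, int64_t sec, unsigned msec,
                 unsigned seq, const char *text)
{
    /* %lld always pairs with long long, whatever the platform's long or
     * time_t width is, so the later arguments stay aligned. */
    return snprintf(buf, size, "audit(%lld.%03u:%u): %s",
                    (long long)sec, msec, seq, text);
}
```

Building with -Wall (which enables -Wformat in GCC and Clang) is another cheap way to surface mismatches of this kind at compile time.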

stevegrubb avatar Apr 02 '24 22:04 stevegrubb

Thinking about it more, it looks like it's compiled with stack-protector-strong. That would have detected any major stack corruption. Maybe the 4.0.1 release works? (Btw, the testing on musl is appreciated. It's found a couple problems that glibc hides.)

stevegrubb avatar Apr 03 '24 22:04 stevegrubb

I couldn't get ubsan or asan to work.

I did an update to musl v1.2.5. It only required changing the version in the build config. (Yes, it was that easy!) Unfortunately auditd v3.0.7 still segfaults with musl v1.2.5.

Your fix for v4.0.1 worked. I built a debug version and it runs ok with musl v1.2.5. It doesn't segfault anymore. Next I'll do a normal build, with stripping, and mold linker, and other stuff OpenWrt has, and test with the old musl v1.2.4. Since I had valgrind already installed, I did a test run with it. It's a long log. I don't know if any of those messages are normal or not, but I'm attaching it here in case you want to have a look at it. audit.valgrind.4.0.1.b.txt

Since v4.0.1 is working, I consider this bug resolved. But if you still want to find that bug in v3.x (for your curiosity or other reasons), I'm available for testing/debugging.

Thank you!

M95D avatar Apr 04 '24 15:04 M95D