zabbix-docker-monitoring
agent crash with compiled module 3.4.10-3.4.15
cat /etc/os-release
NAME="SLES"
VERSION="12-SP3"
VERSION_ID="12.3"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP3"
docker version
Client:
Version: 18.09.0
API version: 1.39
Go version: go1.10.4
Git commit: 33a45cd
Built: Wed Nov 7 00:25:11 2018
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Enterprise
Engine:
Version: 18.09.0
API version: 1.39 (minimum version 1.12)
Go version: go1.10.4
Git commit: 33a45cd
Built: Wed Nov 7 00:19:46 2018
OS/Arch: linux/amd64
Experimental: false
*** Error in `/usr/sbin/zabbix-agentd: listener #3 [processing request]': munmap_chunk(): invalid pointer: 0x00007f0fd32e4840 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x740ef)[0x7f0fd3b9e0ef]
/lib64/libc.so.6(+0x79646)[0x7f0fd3ba3646]
/usr/lib/modules/zabbix_module_docker.so(zbx_module_docker_net+0x636)[0x7f0fd3704cb3]
/usr/sbin/zabbix-agentd: listener #3 [processing request](process+0x353)[0x4185e3]
/usr/sbin/zabbix-agentd: listener #3 [processing request](listener_thread+0x1ad)[0x41513d]
/usr/sbin/zabbix-agentd: listener #3 [processing request](zbx_thread_start+0x3e)[0x42c45e]
/usr/sbin/zabbix-agentd: listener #3 [processing request](MAIN_ZABBIX_ENTRY+0x2c3)[0x417883]
/usr/sbin/zabbix-agentd: listener #3 [processing request](daemon_start+0x1a9)[0x42cf09]
/usr/sbin/zabbix-agentd: listener #3 [processing request](main+0x9e)[0x40d1fe]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f0fd3b4a725]
/usr/sbin/zabbix-agentd: listener #3 [processing request](_start+0x29)[0x40d309]
======= Memory map: ========
00400000-00457000 r-xp 00000000 fe:01 1052 /usr/sbin/zabbix-agentd
00656000-00657000 r--p 00056000 fe:01 1052 /usr/sbin/zabbix-agentd
00657000-00659000 rw-p 00057000 fe:01 1052 /usr/sbin/zabbix-agentd
00659000-0065e000 rw-p 00000000 00:00 0
00e22000-00e43000 rw-p 00000000 00:00 0 [heap]
00e43000-00e47000 rw-p 00000000 00:00 0 [heap]
7f0fd30cb000-7f0fd30e1000 r-xp 00000000 fe:00 299 /lib64/libgcc_s.so.1
7f0fd30e1000-7f0fd32e0000 ---p 00016000 fe:00 299 /lib64/libgcc_s.so.1
7f0fd32e0000-7f0fd32e1000 r--p 00015000 fe:00 299 /lib64/libgcc_s.so.1
7f0fd32e1000-7f0fd32e2000 rw-p 00016000 fe:00 299 /lib64/libgcc_s.so.1
7f0fd32e2000-7f0fd32e7000 r-xp 00000000 fe:00 350 /lib64/libnss_dns-2.22.so
7f0fd32e7000-7f0fd34e6000 ---p 00005000 fe:00 350 /lib64/libnss_dns-2.22.so
7f0fd34e6000-7f0fd34e7000 r--p 00004000 fe:00 350 /lib64/libnss_dns-2.22.so
7f0fd34e7000-7f0fd34e8000 rw-p 00005000 fe:00 350 /lib64/libnss_dns-2.22.so
7f0fd34e8000-7f0fd34f3000 r-xp 00000000 fe:00 1191 /lib64/libnss_files-2.22.so
7f0fd34f3000-7f0fd36f2000 ---p 0000b000 fe:00 1191 /lib64/libnss_files-2.22.so
7f0fd36f2000-7f0fd36f3000 r--p 0000a000 fe:00 1191 /lib64/libnss_files-2.22.so
7f0fd36f3000-7f0fd36f4000 rw-p 0000b000 fe:00 1191 /lib64/libnss_files-2.22.so
7f0fd36f4000-7f0fd36fa000 rw-p 00000000 00:00 0
7f0fd36fa000-7f0fd370c000 r-xp 00000000 fe:01 1182 /usr/lib/modules/zabbix_module_docker.so
7f0fd370c000-7f0fd390b000 ---p 00012000 fe:01 1182 /usr/lib/modules/zabbix_module_docker.so
7f0fd390b000-7f0fd390c000 r--p 00011000 fe:01 1182 /usr/lib/modules/zabbix_module_docker.so
7f0fd390c000-7f0fd390d000 rw-p 00012000 fe:01 1182 /usr/lib/modules/zabbix_module_docker.so
7f0fd390d000-7f0fd3925000 r-xp 00000000 fe:00 1631 /lib64/libpthread-2.22.so
7f0fd3925000-7f0fd3b24000 ---p 00018000 fe:00 1631 /lib64/libpthread-2.22.so
7f0fd3b24000-7f0fd3b25000 r--p 00017000 fe:00 1631 /lib64/libpthread-2.22.so
7f0fd3b25000-7f0fd3b26000 rw-p 00018000 fe:00 1631 /lib64/libpthread-2.22.so
7f0fd3b26000-7f0fd3b2a000 rw-p 00000000 00:00 0
7f0fd3b2a000-7f0fd3cc5000 r-xp 00000000 fe:00 140 /lib64/libc-2.22.so
7f0fd3cc5000-7f0fd3ec5000 ---p 0019b000 fe:00 140 /lib64/libc-2.22.so
7f0fd3ec5000-7f0fd3ec9000 r--p 0019b000 fe:00 140 /lib64/libc-2.22.so
7f0fd3ec9000-7f0fd3ecb000 rw-p 0019f000 fe:00 140 /lib64/libc-2.22.so
7f0fd3ecb000-7f0fd3ecf000 rw-p 00000000 00:00 0
7f0fd3ecf000-7f0fd3f3d000 r-xp 00000000 fe:01 1207 /usr/lib64/libpcre.so.1.2.7
7f0fd3f3d000-7f0fd413c000 ---p 0006e000 fe:01 1207 /usr/lib64/libpcre.so.1.2.7
7f0fd413c000-7f0fd413d000 r--p 0006d000 fe:01 1207 /usr/lib64/libpcre.so.1.2.7
7f0fd413d000-7f0fd413e000 rw-p 0006e000 fe:01 1207 /usr/lib64/libpcre.so.1.2.7
7f0fd413e000-7f0fd4152000 r-xp 00000000 fe:00 1659 /lib64/libresolv-2.22.so
7f0fd4152000-7f0fd4351000 ---p 00014000 fe:00 1659 /lib64/libresolv-2.22.so
7f0fd4351000-7f0fd4352000 r--p 00013000 fe:00 1659 /lib64/libresolv-2.22.so
7f0fd4352000-7f0fd4353000 rw-p 00014000 fe:00 1659 /lib64/libresolv-2.22.so
7f0fd4353000-7f0fd4355000 rw-p 00000000 00:00 0
7f0fd4355000-7f0fd4357000 r-xp 00000000 fe:00 306 /lib64/libdl-2.22.so
7f0fd4357000-7f0fd4557000 ---p 00002000 fe:00 306 /lib64/libdl-2.22.so
7f0fd4557000-7f0fd4558000 r--p 00002000 fe:00 306 /lib64/libdl-2.22.so
7f0fd4558000-7f0fd4559000 rw-p 00003000 fe:00 306 /lib64/libdl-2.22.so
7f0fd4559000-7f0fd4654000 r-xp 00000000 fe:00 328 /lib64/libm-2.22.so
7f0fd4654000-7f0fd4854000 ---p 000fb000 fe:00 328 /lib64/libm-2.22.so
7f0fd4854000-7f0fd4855000 r--p 000fb000 fe:00 328 /lib64/libm-2.22.so
7f0fd4855000-7f0fd4856000 rw-p 000fc000 fe:00 328 /lib64/libm-2.22.so
7f0fd4856000-7f0fd4877000 r-xp 00000000 fe:00 38 /lib64/ld-2.22.so
7f0fd4a07000-7f0fd4a61000 rw-s 00000000 00:05 229376 /SYSV00000000 (deleted)
7f0fd4a61000-7f0fd4a66000 rw-p 00000000 00:00 0
7f0fd4a75000-7f0fd4a76000 rw-p 00000000 00:00 0
7f0fd4a76000-7f0fd4a77000 rw-p 00000000 00:00 0
7f0fd4a77000-7f0fd4a78000 r--p 00021000 fe:00 38 /lib64/ld-2.22.so
7f0fd4a78000-7f0fd4a79000 rw-p 00022000 fe:00 38 /lib64/ld-2.22.so
7f0fd4a79000-7f0fd4a7a000 rw-p 00000000 00:00 0
7ffda273c000-7ffda275d000 rw-p 00000000 00:00 0 [stack]
7ffda279f000-7ffda27a2000 r--p 00000000 00:00 0 [vvar]
7ffda27a2000-7ffda27a4000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
50660:20181212:134920.403 found metric TX-OK: 44000823
50660:20181212:134920.403 Sending back [44000823]
50660:20181212:134920.404 __zbx_zbx_setproctitle() title:'listener #1 [waiting for connection]'
50658:20181212:134920.406 One child process died (PID:50662,exitcode/signal:6). Exiting ...
50658:20181212:134920.406 zbx_on_exit() called
50659:20181212:134920.406 Got signal [signal:15(SIGTERM),sender_pid:50658,sender_uid:0,reason:0]. Exiting ...
50663:20181212:134920.406 Got signal [signal:15(SIGTERM),sender_pid:50658,sender_uid:0,reason:0]. Exiting ...
50660:20181212:134920.406 Got signal [signal:15(SIGTERM),sender_pid:50658,sender_uid:0,reason:0]. Exiting ...
50661:20181212:134920.407 Got signal [signal:15(SIGTERM),sender_pid:50658,sender_uid:0,reason:0]. Exiting ...
zabbix-agentd [50658]: Error waiting for process with PID 50662: [10] No child processes
50658:20181212:134920.407 In zbx_dshm_destroy() shmid:-1
50658:20181212:134920.407 End of zbx_dshm_destroy():SUCCEED
50658:20181212:134920.407 In zbx_unload_modules()
50658:20181212:134920.407 In zbx_module_uninit()
50658:20181212:134920.408 End of zbx_unload_modules()
50658:20181212:134920.408 Zabbix Agent stopped. Zabbix 3.4.15 (revision 86739).
Did you compile the module for your system with correct Zabbix version? Could you provide more logs before backtrace, please?
Yes, I compiled it against the correct version. These hosts run in manager-worker mode. On a host without a cluster, everything is fine. Some logs:
65158:20181212:145638.254 Requested [docker.mem[d8dc14731a9ca28c2c8f4b2c3063db03f752b751ef40bf17d0c07e169a3e2918,total_cache]]
65158:20181212:145638.254 In zbx_module_docker_mem()
65158:20181212:145638.254 In zbx_module_docker_get_fci()
65158:20181212:145638.254 Original full container id will be used
65158:20181212:145638.254 Metric source file: /sys/fs/cgroup/memory/docker/d8dc14731a9ca28c2c8f4b2c3063db03f752b751ef40bf17d0c07e169a3e2918/memory.stat
65158:20181212:145638.254 Looking metric total_cache in memory.stat file
65158:20181212:145638.254 Id: d8dc14731a9ca28c2c8f4b2c3063db03f752b751ef40bf17d0c07e169a3e2918; metric: total_cache; value: 77406208
65158:20181212:145638.254 Sending back [77406208]
65158:20181212:145638.255 __zbx_zbx_setproctitle() title:'listener #3 [waiting for connection]'
65158:20181212:145638.256 __zbx_zbx_setproctitle() title:'listener #3 [processing request]'
65158:20181212:145638.257 Requested [docker.mem[3fd2b78b602d02a879dffb33a0073725d38dc04c48959a50b5b115dae7feba9b,total_rss]]
65158:20181212:145638.257 In zbx_module_docker_mem()
65158:20181212:145638.257 In zbx_module_docker_get_fci()
65158:20181212:145638.257 Original full container id will be used
65158:20181212:145638.257 Metric source file: /sys/fs/cgroup/memory/docker/3fd2b78b602d02a879dffb33a0073725d38dc04c48959a50b5b115dae7feba9b/memory.stat
65158:20181212:145638.258 Looking metric total_rss in memory.stat file
65158:20181212:145638.258 Id: 3fd2b78b602d02a879dffb33a0073725d38dc04c48959a50b5b115dae7feba9b; metric: total_rss; value: 73814016
65158:20181212:145638.258 Sending back [73814016]
65158:20181212:145638.258 __zbx_zbx_setproctitle() title:'listener #3 [waiting for connection]'
65157:20181212:145638.260 __zbx_zbx_setproctitle() title:'listener #2 [processing request]'
65157:20181212:145638.261 Requested [docker.mem[598fb024a76008b3919ba2debe37319d9a96d2a90dce521f23b6dc7c3dd2a648,total_swap]]
65157:20181212:145638.261 In zbx_module_docker_mem()
65157:20181212:145638.261 In zbx_module_docker_get_fci()
65157:20181212:145638.261 Original full container id will be used
65157:20181212:145638.261 Metric source file: /sys/fs/cgroup/memory/docker/598fb024a76008b3919ba2debe37319d9a96d2a90dce521f23b6dc7c3dd2a648/memory.stat
65157:20181212:145638.261 Cannot open metric file: '/sys/fs/cgroup/memory/docker/598fb024a76008b3919ba2debe37319d9a96d2a90dce521f23b6dc7c3dd2a648/memory.stat'
65157:20181212:145638.261 Sending back [ZBX_NOTSUPPORTED: Cannot open memory.stat file]
65157:20181212:145638.261 __zbx_zbx_setproctitle() title:'listener #2 [waiting for connection]'
65157:20181212:145638.263 __zbx_zbx_setproctitle() title:'listener #2 [processing request]'
65157:20181212:145638.264 Requested [docker.up[f508d3c86f17820bf51dea6517045a1ce6dddc457d53ec397c61309ecd6b090e]]
65157:20181212:145638.264 In zbx_module_docker_up()
65157:20181212:145638.264 In zbx_module_docker_get_fci()
65157:20181212:145638.264 Original full container id will be used
65157:20181212:145638.264 Metric source file: /sys/fs/cgroup/cpu,cpuacct/docker/f508d3c86f17820bf51dea6517045a1ce6dddc457d53ec397c61309ecd6b090e/cpuacct.stat
65157:20181212:145638.264 Cannot open metric file: '/sys/fs/cgroup/cpu,cpuacct/docker/f508d3c86f17820bf51dea6517045a1ce6dddc457d53ec397c61309ecd6b090e/cpuacct.stat', container doesn't run
65157:20181212:145638.264 Sending back [0]
65157:20181212:145638.264 __zbx_zbx_setproctitle() title:'listener #2 [waiting for connection]'
65157:20181212:145638.266 __zbx_zbx_setproctitle() title:'listener #2 [processing request]'
65157:20181212:145638.267 Requested [docker.xnet[f3a1997592d3b0dc7cad00e834759e8f699e9e96108d5d6dc0c3d5afe38701a3,eth0,RX-OK]]
65157:20181212:145638.267 In zbx_module_docker_net()
65157:20181212:145638.267 In zbx_module_docker_get_fci()
65157:20181212:145638.267 Original full container id will be used
65157:20181212:145638.267 netns file: /var/run/netns/zabbix_module_docker_f3a1997592d3b0dc7cad00e834759e8f699e9e96108d5d6dc0c3d5afe38701a3
65157:20181212:145638.267 Tasks file: /sys/fs/cgroup/devices/docker/f3a1997592d3b0dc7cad00e834759e8f699e9e96108d5d6dc0c3d5afe38701a3/tasks
65157:20181212:145638.267 Cannot open Docker tasks file: '/sys/fs/cgroup/devices/docker/f3a1997592d3b0dc7cad00e834759e8f699e9e96108d5d6dc0c3d5afe38701a3/tasks'
65157:20181212:145638.267 Sending back [ZBX_NOTSUPPORTED: Cannot open Docker tasks file]
65157:20181212:145638.267 __zbx_zbx_setproctitle() title:'listener #2 [waiting for connection]'
65157:20181212:145638.273 __zbx_zbx_setproctitle() title:'listener #2 [processing request]'
65157:20181212:145638.274 Requested [docker.xnet[71b227a3c00d0b6862cd82187d9bcd68be4698ece453bc90c3ff8dd6bc3b6f26,eth0,RX-OK]]
65157:20181212:145638.274 In zbx_module_docker_net()
65157:20181212:145638.274 In zbx_module_docker_get_fci()
65157:20181212:145638.274 Original full container id will be used
65157:20181212:145638.274 netns file: /var/run/netns/zabbix_module_docker_71b227a3c00d0b6862cd82187d9bcd68be4698ece453bc90c3ff8dd6bc3b6f26
65157:20181212:145638.274 Tasks file: /sys/fs/cgroup/devices/docker/71b227a3c00d0b6862cd82187d9bcd68be4698ece453bc90c3ff8dd6bc3b6f26/tasks
*** Error in `/usr/sbin/zabbix-agentd: listener #2 [processing request]': munmap_chunk(): invalid pointer: 0x00007f9ea11ce840 ***
The problem is with docker.xnet. Did you fulfill the requirements mentioned in the Readme?
Note 1: Root permissions (AllowRoot=1) are required, because net namespaces (/var/run/netns/) are created/used
Note 2: netstat is needed to be installed and available in PATH
- AllowRoot=1 is set
- Netstat is installed and available.
Some network data appeared in Zabbix before the agent died.
It is probably crashing somewhere in this part: https://github.com/monitoringartist/zabbix-docker-monitoring/blob/dba2fb727e411493bcc4e540d5bac681836d12fc/src/modules/zabbix_module_docker/zabbix_module_docker.c#L1286-L1306
Probably some pointer passed to the free function is not valid. It will require deeper investigation to prove it.
I also have this problem, with zabbix-agent version 4.4.3 on Debian 9.9.
But when I downgraded zabbix-agent to 4.2.8 and compiled the .so, it worked!
Same problem. OS: Ubuntu 18.04, Debian 9, Debian 10; agent version: 5.0.12. zabbix_module_docker.so was downloaded from the master branch.
It looks to me that the problem is here: https://github.com/monitoringartist/zabbix-docker-monitoring/blob/fd3f6e818e31989972f15fbe86079573fc1c6608/src/modules/zabbix_module_docker/zabbix_module_docker.c#L1274-L1280
If fgets() fails, the loop body is never executed and first_task is never initialized, so the subsequent attempt to release the memory: https://github.com/monitoringartist/zabbix-docker-monitoring/blob/fd3f6e818e31989972f15fbe86079573fc1c6608/src/modules/zabbix_module_docker/zabbix_module_docker.c#L1290 ...will lead to a crash.
The solution would be to convert this while loop into an if/else construct. However, I don't know what to put in the else branch, because I am looking at it purely from a C developer's perspective. @jangaraj and the rest, what does it mean if the Tasks file is empty? How should the module behave in this case?
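To make the suggested if/else shape concrete, here is a minimal sketch. It is not the module's actual code: read_first_task() is a hypothetical helper, the 64-byte buffer is an arbitrary choice, and returning NULL for an empty Tasks file is only an assumption about the desired behavior, which is exactly the open question above. The point it illustrates is that the pointer is initialized to NULL, so a later free() is safe even when fgets() reads nothing.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from zabbix_module_docker.c).
 * Returns a malloc'd copy of the first line of the file at `path`
 * with the trailing newline removed, or NULL if the file cannot be
 * opened, is empty, or memory allocation fails.
 * The caller owns the result and may always pass it to free().
 */
char *read_first_task(const char *path)
{
    char line[64];
    char *first_task = NULL; /* initialized: free(NULL) is a no-op */
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return NULL;

    if (fgets(line, sizeof(line), fp) != NULL) {
        /* Strip the trailing newline, if any. */
        line[strcspn(line, "\n")] = '\0';
        if (line[0] != '\0') {
            size_t len = strlen(line) + 1;
            first_task = malloc(len);
            if (first_task != NULL)
                memcpy(first_task, line, len);
        }
    } else {
        /* Assumption: an empty Tasks file is reported to the caller
         * as "no task" by leaving first_task set to NULL. */
    }

    fclose(fp);
    return first_task;
}
```

A caller could then treat a NULL result the same way the module already treats a Tasks file it cannot open (in the logs above that is answered with ZBX_NOTSUPPORTED), and call free() on the result unconditionally. Whether that is the right answer for a container with an empty Tasks file is for the maintainers to decide.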