netdata
[Bug]: SIGSEGV logs management
Bug description
On Arch Linux, I observed the following coredump:
#0 0x00007f25265158c7 in __GI___regexec (preg=preg@entry=0x55b5e04a5920 <req_client_regex>, string=string@entry=0x7ffcf4ea02da "::1", nmatch=nmatch@entry=0, pmatch=pmatch@entry=0x0, eflags=eflags@entry=0)
at /usr/src/debug/glibc/glibc/posix/regexec.c:214
Downloading source file /usr/src/debug/glibc/glibc/posix/regexec.c
214 lock_lock (dfa->lock);
[Current thread is 1 (Thread 0x7f2526f89640 (LWP 484))]
(gdb) bt
#0 0x00007f25265158c7 in __GI___regexec (preg=preg@entry=0x55b5e04a5920 <req_client_regex>, string=string@entry=0x7ffcf4ea02da "::1", nmatch=nmatch@entry=0, pmatch=pmatch@entry=0x0, eflags=eflags@entry=0)
at /usr/src/debug/glibc/glibc/posix/regexec.c:214
#1 0x00007f252657515d in __compat_regexec (preg=preg@entry=0x55b5e04a5920 <req_client_regex>, string=string@entry=0x7ffcf4ea02da "::1", nmatch=nmatch@entry=0, pmatch=pmatch@entry=0x0, eflags=eflags@entry=0)
at /usr/src/debug/glibc/glibc/posix/regexec.c:240
#2 0x000055b5e02b9585 in parse_web_log_line (wblp_config=wblp_config@entry=0x55b5e1343d30, line=line@entry=0x55b5e1343ce0 "::1 - - [09/Nov/2023:19:54:09 +0000] \"GET / HTTP/1.0\" 200 481\n", line_len=61,
log_line_parsed=log_line_parsed@entry=0x7ffcf4ea01d0) at /home/thiago/Netdata/netdata/logsmanagement/parser.c:633
#3 0x000055b5e02bb956 in auto_detect_web_log_parser_config (line=line@entry=0x55b5e1343ce0 "::1 - - [09/Nov/2023:19:54:09 +0000] \"GET / HTTP/1.0\" 200 481\n", delimiter=delimiter@entry=32 ' ')
at /home/thiago/Netdata/netdata/logsmanagement/parser.c:1490
#4 0x000055b5e02b5f2d in config_section_init (main_loop=main_loop@entry=0x55b5e1304470, config_section=config_section@entry=0x55b5e1323430, forward_in_config=forward_in_config@entry=0x0,
p_flb_srvc_config=p_flb_srvc_config@entry=0x7ffcf4ea2830, stdout_mut=stdout_mut@entry=0x55b5e04a5880 <stdout_mut>) at /home/thiago/Netdata/netdata/logsmanagement/logsmanag_config.c:886
#5 0x000055b5e02b816c in config_file_load (main_loop=0x55b5e1304470, p_forward_in_config=0x0, p_flb_srvc_config=p_flb_srvc_config@entry=0x7ffcf4ea2830, stdout_mut=stdout_mut@entry=0x55b5e04a5880 <stdout_mut>)
at /home/thiago/Netdata/netdata/logsmanagement/logsmanag_config.c:1406
#6 0x000055b5e02a352c in main (argc=<optimized out>, argv=<optimized out>) at /home/thiago/Netdata/netdata/logsmanagement/logsmanagement.c:168
This happens when the logs management plugin is enabled.
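In the backtrace above, frame #0 faults at `lock_lock (dfa->lock)` inside `regexec()`, with `preg` pointing at the static `req_client_regex`. One possible explanation, which is an assumption and not confirmed anywhere in this report, is that `regexec()` was reached with a `regex_t` that had never been successfully compiled by `regcomp()`. A minimal sketch of a guard against that situation follows; the `guarded_regex` type and its helper functions are hypothetical and are not part of the Netdata code base.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wrapper: pairs a regex_t with a flag that records whether
 * regcomp() succeeded, so an uncompiled pattern is never handed to regexec(). */
typedef struct {
    regex_t re;
    bool compiled;
} guarded_regex;

/* Compile `pattern` as an extended regex; returns true on success. */
static bool guarded_regex_init(guarded_regex *g, const char *pattern) {
    g->compiled = (regcomp(&g->re, pattern, REG_EXTENDED | REG_NOSUB) == 0);
    return g->compiled;
}

/* Returns 1 on match, 0 on no match, -1 if the regex is not compiled. */
static int guarded_regex_match(const guarded_regex *g, const char *s) {
    if (!g || !g->compiled || !s)
        return -1;
    return regexec(&g->re, s, 0, NULL, 0) == 0 ? 1 : 0;
}

/* Release the compiled pattern and mark the wrapper as unusable again. */
static void guarded_regex_free(guarded_regex *g) {
    if (g && g->compiled) {
        regfree(&g->re);
        g->compiled = false;
    }
}
```

A zero-initialized (for example, static) `guarded_regex` starts with `compiled == false`, so `guarded_regex_match()` returns -1 instead of dereferencing an unset pattern buffer. A caller can treat -1 as "parser not ready" and skip the line.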
Expected behavior
The plugin should run normally without crashing.
Steps to reproduce
- Compile on Arch.
- Enable plugin inside netdata.conf
- Set the configuration file /etc/netdata/logsmanagement.d.conf:
[global]
enabled = yes
update every = 1
update timeout = 10
use log timestamp = auto
circular buffer max size MiB = 64
circular buffer drop logs if full = no
compression acceleration = 1
collected logs total chart enable = no
collected logs rate chart enable = yes
[db]
db mode = full
db dir = /var/cache/netdata/logs_management_db
circular buffer flush to db = 6
disk space limit MiB = 500
[forward input]
enabled = no
unix path =
unix perm = 0644
listen = 0.0.0.0
port = 24224
[fluent bit]
flush = 0.1
http listen = 0.0.0.0
http port = 2020
http server = false
log file = /var/log/netdata/fluentbit.log
log level = info
coro stack size = 24576
...
Installation method
from source
System info
Linux archlinux 6.6.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 14 Dec 2023 03:45:42 +0000 x86_64 GNU/Linux
/etc/os-release:NAME="Arch Linux"
/etc/os-release:PRETTY_NAME="Arch Linux"
/etc/os-release:ID=arch
/etc/os-release:BUILD_ID=rolling
/etc/os-release:ANSI_COLOR="38;2;23;147;209"
/etc/os-release:LOGO=archlinux-logo
Netdata build info
Packaging:
Netdata Version ____________________________________________ : v1.44.0-77-nightly
Installation Type __________________________________________ : custom
Package Architecture _______________________________________ : unknown
Package Distro _____________________________________________ : unknown
Configure Options __________________________________________ : dummy-configure-command
Default Directories:
User Configurations ________________________________________ : /etc/netdata
Stock Configurations _______________________________________ : /usr/lib/netdata/conf.d
Ephemeral Databases (metrics data, metadata) _______________ : /var/cache/netdata
Permanent Databases ________________________________________ : /var/lib/netdata
Plugins ____________________________________________________ : /usr/libexec/netdata/plugins.d
Static Web Files ___________________________________________ : /usr/share/netdata/web
Log Files __________________________________________________ : /var/log/netdata
Lock Files _________________________________________________ : /var/lib/netdata/lock
Home _______________________________________________________ : /var/lib/netdata
Operating System:
Kernel _____________________________________________________ : Linux
Kernel Version _____________________________________________ : 6.6.7-arch1-1
Operating System ___________________________________________ : Arch Linux
Operating System ID ________________________________________ : arch
Operating System ID Like ___________________________________ : unknown
Operating System Version ___________________________________ : unknown
Operating System Version ID ________________________________ : none
Detection __________________________________________________ : /etc/os-release
Hardware:
CPU Cores __________________________________________________ : 2
CPU Frequency ______________________________________________ : 2903000000
RAM Bytes __________________________________________________ : 1011892224
Disk Capacity ______________________________________________ : 21474836480
CPU Architecture ___________________________________________ : x86_64
Virtualization Technology __________________________________ : kvm
Virtualization Detection ___________________________________ : systemd-detect-virt
Container:
Container __________________________________________________ : none
Container Detection ________________________________________ : systemd-detect-virt
Container Orchestrator _____________________________________ : none
Container Operating System _________________________________ : none
Container Operating System ID ______________________________ : none
Container Operating System ID Like _________________________ : none
Container Operating System Version _________________________ : none
Container Operating System Version ID ______________________ : none
Container Operating System Detection _______________________ : none
Features:
Built For __________________________________________________ : Linux
Netdata Cloud ______________________________________________ : YES
Health (trigger alerts and send notifications) _____________ : YES
Streaming (stream metrics to parent Netdata servers) _______ : YES
Back-filling (of higher database tiers) ____________________ : YES
Replication (fill the gaps of parent Netdata servers) ______ : YES
Streaming and Replication Compression ______________________ : YES (brotli zstd lz4 gzip)
Contexts (index all active and archived metrics) ___________ : YES
Tiering (multiple dbs with different metrics resolution) ___ : YES (5)
Machine Learning ___________________________________________ : NO
Database Engines:
dbengine ___________________________________________________ : YES
alloc ______________________________________________________ : YES
ram ________________________________________________________ : YES
map ________________________________________________________ : YES
save _______________________________________________________ : YES
none _______________________________________________________ : YES
Connectivity Capabilities:
ACLK (Agent-Cloud Link: MQTT over WebSockets over TLS) _____ : YES
static (Netdata internal web server) _______________________ : YES
h2o (web server) ___________________________________________ : YES
WebRTC (experimental) ______________________________________ : NO
Native HTTPS (TLS Support) _________________________________ : YES
TLS Host Verification ______________________________________ : YES
Libraries:
LZ4 (extremely fast lossless compression algorithm) ________ : YES
ZSTD (fast, lossless compression algorithm) ________________ : YES
zlib (lossless data-compression library) ___________________ : YES
protobuf (platform-neutral data serialization protocol) ____ : YES (bundled)
OpenSSL (cryptography) _____________________________________ : YES
libdatachannel (stand-alone WebRTC data channels) __________ : NO
JSON-C (lightweight JSON manipulation) _____________________ : YES
libcap (Linux capabilities system operations) ______________ : YES
libcrypto (cryptographic functions) ________________________ : YES
Plugins:
apps (monitor processes) ___________________________________ : YES
cgroups (monitor containers and VMs) _______________________ : YES
cgroup-network (associate interfaces to CGROUPS) ___________ : YES
proc (monitor Linux systems) _______________________________ : YES
tc (monitor Linux network QoS) _____________________________ : YES
diskspace (monitor Linux mount points) _____________________ : YES
freebsd (monitor FreeBSD systems) __________________________ : NO
macos (monitor MacOS systems) ______________________________ : NO
statsd (collect custom application metrics) ________________ : YES
timex (check system clock synchronization) _________________ : YES
idlejitter (check system latency and jitter) _______________ : YES
bash (support shell data collection jobs - charts.d) _______ : YES
debugfs (kernel debugging metrics) _________________________ : YES
cups (monitor printers and print jobs) _____________________ : NO
ebpf (monitor system calls) ________________________________ : YES
freeipmi (monitor enterprise server H/W) ___________________ : NO
nfacct (gather netfilter accounting) _______________________ : NO
perf (collect kernel performance events) ___________________ : YES
slabinfo (monitor kernel object caching) ___________________ : YES
Xen ________________________________________________________ : NO
Xen VBD Error Tracking _____________________________________ : NO
Logs Management ____________________________________________ : YES
Exporters:
AWS Kinesis ________________________________________________ : NO
GCP PubSub _________________________________________________ : NO
MongoDB ____________________________________________________ : NO
Prometheus (OpenMetrics) Exporter __________________________ : YES
Prometheus Remote Write ____________________________________ : NO
Graphite ___________________________________________________ : YES
Graphite HTTP / HTTPS ______________________________________ : YES
JSON _______________________________________________________ : YES
JSON HTTP / HTTPS __________________________________________ : YES
OpenTSDB ___________________________________________________ : YES
OpenTSDB HTTP / HTTPS ______________________________________ : YES
All Metrics API ____________________________________________ : YES
Shell (use metrics in shell scripts) _______________________ : YES
Debug/Developer Features:
Trace All Netdata Allocations (with charts) ________________ : NO
Developer Mode (more runtime checks, slower) _______________ : YES
Additional info
I could not reproduce this issue on every Linux distribution.
I am also experiencing a SIGSEGV caused by the logs management plugin:
sudo coredumpctl -1 debug
PID: 238230 (logs-management)
UID: 998 (netdata)
GID: 999 (netdata)
Signal: 11 (SEGV)
Timestamp: Tue 2024-01-02 19:09:54 UTC (1h 58min ago)
Command Line: /usr/libexec/netdata/plugins.d/logs-management.plugin 1
Executable: /usr/libexec/netdata/plugins.d/logs-management.plugin
Control Group: /system.slice/netdata.service
Unit: netdata.service
Slice: system.slice
Boot ID: fa34680522df49e1a31789f385aed95d
Machine ID: 22a3a6ef70f741b0b60db5afec90e682
Hostname: netdata
Storage: /var/lib/systemd/coredump/core.logs-management.998.fa34680522df49e1a31789f385aed95d.238230.1704222594000000.zst (present)
Disk Size: 224.3K
Message: Process 238230 (logs-management) of user 998 dumped core.
Found module /usr/libexec/netdata/plugins.d/logs-management.plugin with build-id: 6dfc00592da08e8fd6831aeee2328618c8b6f6b4
Found module linux-vdso.so.1 with build-id: ea99e5d980dd1a4d23af20aa35a7d823b5c92f97
Found module libssl.so.3 with build-id: ce838f6c51f037b73ade040b4abd647d7ae7d62d
Found module libgpg-error.so.0 with build-id: 3fbec71c67bee60d8aef00697ee187079b0fb307
Found module ld-linux-x86-64.so.2 with build-id: cccdd41e22e25f77a8cda3d045c57ffdb01a9793
Found module libgcrypt.so.20 with build-id: 60a5e524de0ed8323edf33e9eb9127a9eee02359
Found module libcap.so.2 with build-id: b4bf900abf14aabe12d90988ceb30888acb2bcb0
Found module libzstd.so.1 with build-id: 5d9d0d946a3154a748e87e17af9d14764519237b
Found module liblzma.so.5 with build-id: b85da6c48eb60a646615392559483b93617ef265
Found module libc.so.6 with build-id: 203de0ae33b53fee1578b117cb4123e85d0534f0
Found module libgcc_s.so.1 with build-id: e3a44e0da9c6e835d293ed8fd2882b4c4a87130c
Found module libm.so.6 with build-id: 9f3c01b284b7fd2427aa8ae047f2720e12a4d396
Found module libcrypto.so.3 with build-id: 156e054fb88f59a4100ca7edc74a79e3908027a8
Found module libuv.so.1 with build-id: ff2c8af1d41a623ee738cb5839fb10384ad1c65f
Found module libuuid.so.1 with build-id: 64c0d0cb22fa2bdeca075a0c0418ba5ff314b220
Found module liblz4.so.1 with build-id: a85971851cd059f1af80d553c8e7170d42ec59a1
Found module libsystemd.so.0 with build-id: e45f7492c0f62251620378d7224ad0371a8d1f98
Stack trace of thread 242729:
#0 0x00007f604f43c49e uv_timer_stop (libuv.so.1 + 0xa49e)
#1 0x00005567dbbf5800 n/a (/usr/libexec/netdata/plugins.d/logs-management.plugin + 0x20800)
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/libexec/netdata/plugins.d/logs-management.plugin...
[New LWP 242729]
[New LWP 238230]
[New LWP 238298]
[New LWP 238299]
[New LWP 242728]
[New LWP 238302]
[New LWP 238303]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/libexec/netdata/plugins.d/logs-management.plugin 1'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f604f43c49e in uv_timer_stop () from /lib/x86_64-linux-gnu/libuv.so.1
[Current thread is 1 (Thread 0x7f604e31d640 (LWP 242729))]
(gdb) bt full
#0 0x00007f604f43c49e in uv_timer_stop () from /lib/x86_64-linux-gnu/libuv.so.1
No symbol table info available.
#1 0x00005567dbbf5800 in p_file_info_destroy (arg=0x5567dd8148d0) at /home/max/netdata/logsmanagement/logsmanag_config.c:106
p_file_info = <optimized out>
__FUNCTION__ = "p_file_info_destroy"
chartname = "logs_manag_systemd_logs\000S4\326N`\177\000\000\340\b\000H`\177\000\000\200|\355N`\177\000\000\340\b\000H`\177\000\000\030\361\377\377\377\377\377\377P\361\377\377\377\377\377\377\343\065\326N`\177\000\000@\326\061N`\177\000\000`\v\000H`\177\000\000\340\261\002"
output_next = <optimized out>
#2 0x00007f604ed52ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
ret = <optimized out>
pd = <optimized out>
out = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140731648024880, 8131616884517824162, 140051605476928, 9, 140051616180176, 140731648025232, -8207297833926072670, -8207295940654864734}, mask_was_saved = 0}}, priv = {pad = {0x0,
0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
#3 0x00007f604ede4660 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
No locals.
(gdb) where
#0 0x00007f604f43c49e in uv_timer_stop () from /lib/x86_64-linux-gnu/libuv.so.1
#1 0x00005567dbbf5800 in p_file_info_destroy (arg=0x5567dd8148d0) at /home/max/netdata/logsmanagement/logsmanag_config.c:106
#2 0x00007f604ed52ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#3 0x00007f604ede4660 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Closing this issue because we don't plan to fix it - as of today logsmanagement.plugin is unmaintained and not supported.