fluent-plugin-systemd fails with SIGABRT on Ubuntu 21.04
When using fluent-plugin-systemd the worker crashes hard with a SIGABRT. Initially we assumed it to be a problem with the plugin, but it turned out to be related to libjemalloc. After removing
Environment=LD_PRELOAD=/opt/td-agent/lib/libjemalloc.so
from the service file, the crashes are gone.
See https://github.com/ledbettj/systemd-journal/issues/93 for more details.
We were using td-agent 3 in the above example, but the issue is the same with td-agent 4.
Some more info:
- Ubuntu 21.10
- td-agent 4.3.0-1 from http://packages.treasuredata.com/4/ubuntu/focal/
Related config part:
<source>
@type systemd
tag systemd
path /var/log/journal
<storage>
@type local
persistent true
path /var/tmp/fluentd_systemd
</storage>
<entry>
fields_strip_underscores true
fields_lowercase true
</entry>
</source>
Let me know, in case I can help with further details.
This issue has been automatically marked as stale because it has been open 90 days with no activity. Remove stale label or comment or this issue will be closed in 30 days
This issue was automatically closed because of stale in 30 days
Seems like this problem still exists.
Environment
- Ubuntu 22.04
- td-agent 4.4.1 fluentd 1.15.2 (c32842297ed2c306f1b841a8f6e55bdd0f1cb27f)
- Installed by
$ curl -fsSL https://toolbelt.treasuredata.com/sh/install-ubuntu-jammy-td-agent4.sh | sh
How to Reproduce
- Install fluent-plugin-systemd plugin:
$ td-agent-gem install fluent-plugin-systemd
- Add the following setting
<source>
@type systemd
tag debug
path /var/log/journal
read_from_head true
</source>
$ (sudo) adduser td-agent systemd-journal
$ (sudo) systemctl restart td-agent
Result
- After reading one record, the worker dies with SIGABRT.
2022-10-12 05:58:38 +0000 [info]: #0 fluentd worker is now running worker=0
2022-10-12 04:50:35.083947000 +0000 debug: {"SYSLOG_FACILITY":"3","SYSLOG_IDENTIFIER":"systemd-journald","_TRANSPORT":"driver","PRIORITY":"6","MESSAGE_ID":"f77379a8490b408bbe5f6940505a777b","MESSAGE":"Journal started","_PID":"57","_UID":"0","_GID":"0","_COMM":"systemd-journal","_EXE":"/usr/lib/systemd/systemd-journald","_CMDLINE":"/lib/systemd/systemd-journald","_CAP_EFFECTIVE":"25402800cf","_SELINUX_CONTEXT":"unconfined\n","_SYSTEMD_CGROUP":"/system.slice/systemd-journald.service","_SYSTEMD_UNIT":"systemd-journald.service","_SYSTEMD_SLICE":"system.slice","_SYSTEMD_INVOCATION_ID":"ad56283776054be3859ad9b4e1f962d5","_BOOT_ID":"eb783cbf1e3c47c0a680a80b99e356d9","_MACHINE_ID":"6981994dead5402094f9195aec951d36","_HOSTNAME":"jammy-td-agetn"}
2022-10-12 05:58:39 +0000 [error]: Worker 0 finished unexpectedly with signal SIGABRT
ETC
This doesn't reproduce on Ubuntu 20.04.
As @scrwr says, we can avoid this issue by commenting out the following line in /lib/systemd/system/td-agent.service
Environment=LD_PRELOAD=/opt/td-agent/lib/libjemalloc.so
Then apply this.
$ (sudo) systemctl daemon-reload
$ (sudo) systemctl restart td-agent
However, is it correct to comment this out?
Editing /lib/systemd/system/td-agent.service directly is not recommended.
The correct way to change environment variables would be as follows for Ubuntu:
- Edit /etc/default/td-agent and add the following line:
LD_PRELOAD=
- Then restart td-agent:
$ (sudo) systemctl restart td-agent
My concern here was the effect of omitting this environment variable, but it seems that if memory usage is not a problem, this environment variable can be omitted.
Thus, for now, this seems to be a workaround.
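To confirm that the workaround took effect after the restart, one option is to inspect the environment of the running process. The following is a minimal sketch, not something from the thread: `show_preload` is a hypothetical helper name, and it assumes a Linux /proc filesystem and that the unit is named td-agent.

```shell
# Print the LD_PRELOAD entry from a NUL-separated environ file
# (the format of /proc/<pid>/environ), or a note if there is no entry.
show_preload() {
  tr '\0' '\n' < "$1" | grep '^LD_PRELOAD=' || echo "LD_PRELOAD is not set"
}

# Usage, as root, assuming the unit is named td-agent:
#   show_preload "/proc/$(systemctl show -p MainPID --value td-agent)/environ"
```

If the output still shows the libjemalloc.so path, the override was not picked up. An empty value (`LD_PRELOAD=`) or no entry at all suggests jemalloc is no longer preloaded.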
Having the same issue. The workaround helped, but I wonder if there is any progress on a permanent fix?
I think there is no progress. We still need this workaround for fluent-plugin-systemd in some environments.
@mszabo Could you share your environment information? Are you using Ubuntu?
@daipom Yes, the issue surfaced when we started migrating to the latest Ubuntu LTS (22.04).
Thanks!
Hello,
Migrated my app from RHEL 8.8 to RHEL 9.2 and started experiencing the same issue:
2023-10-05 11:06:29 +0200 [error]: Worker 0 exited unexpectedly with signal SIGABRT
The workaround of unsetting the LD_PRELOAD variable helped. Posting my environment info in case it helps with a permanent fix.
- OS: RHEL 9.2
- kernel version 5.14.0-284.30.1
- td-agent package version 4.5.1-1
- ruby version 3.1.4p223 (bundled with td-agent)
- fluent-plugin-systemd gem version 1.0.5
- systemd-journal gem version 1.4.2
Thanks, Andrii
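The same workaround can also be applied with a systemd drop-in, which leaves the packaged unit file untouched and does not depend on a distribution-specific environment file. This is a sketch under assumptions rather than a method confirmed in the thread: `write_dropin` is a hypothetical helper, and the unit name td-agent.service is taken from the comments above.

```shell
# Write a drop-in that clears LD_PRELOAD for the service.
# $1 is the drop-in directory, e.g. /etc/systemd/system/td-agent.service.d
write_dropin() {
  dir="$1"
  mkdir -p "$dir" && cat > "$dir/override.conf" <<'EOF'
[Service]
# A later Environment= assignment overrides the one in the packaged unit.
Environment=LD_PRELOAD=
EOF
}

# Usage, as root:
#   write_dropin /etc/systemd/system/td-agent.service.d
#   systemctl daemon-reload
#   systemctl restart td-agent
```

Running `systemctl edit td-agent` and entering the same two lines by hand produces an equivalent drop-in.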