beats
Make Journald input resilient to Journald errors
Currently, if there is any error reading the next message from Journald, the input stops working and never recovers, which effectively halts ingestion.
This happens because any error from reading or publishing a message (https://github.com/elastic/beats/blob/ffcd1814666645a5d7a644911ecf6e2b7d8db3f5/filebeat/input/journald/input.go#L163-L173) is returned by the `Run` method. `Run` is called in a goroutine that logs the error and then exits (https://github.com/elastic/beats/blob/ffcd1814666645a5d7a644911ecf6e2b7d8db3f5/filebeat/input/v2/compat/compat.go#L119-L135), so nothing restarts the input.
We need to make the Journald input more resilient to the errors we get when calling the host's journald via `github.com/coreos/go-systemd/v22/sdjournal`.
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
Even after the merge of https://github.com/elastic/beats/pull/40061 and the migration to using `journalctl`, this issue is still relevant: if `journalctl` crashes, the input finishes and the ingestion of journal messages stops.