beats
High I/O consumption after sudden Filebeat stop
Hi! I tried to ask on discuss.elastic.co but got no answer.
The problem is very high I/O after a sudden termination of Filebeat. The cause is a checkpoint being performed on every log operation, because the log_invalid flag is set to true after the initial log read fails. After an abnormal termination of Filebeat, the log may be in an inconsistent state, and reading such a log can cause this error: Incomplete or corrupted log file in /usr/share/filebeat/data/registry/filebeat. Continue with last known complete and consistent state. Reason: invalid character '\\x00' looking for beginning of value
After that, Filebeat clears the log file but still does not try to write to it, and instead just performs checkpoint after checkpoint.
- Version: 8.1.0, but I think the bug is still present in master
- Operating System: Ubuntu 18.04 kernel 5.4.0-139-generic
- Discuss Forum URL: https://discuss.elastic.co/t/high-iops-from-filebeat/334399
- Steps to Reproduce:
  - Start Filebeat
  - Shut down the machine suddenly
  - Start the machine again
  - Start Filebeat
  - Check the log for errors
We are seeing the same issue: https://discuss.elastic.co/t/filebeat-causing-a-very-large-iowait-and-lagging-after-uncontrolled-reboot/351981
@elastic/obs-dc can anyone help here?
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
Hey folks, thanks for finding this bug and proposing a fix! Looking at the code I can see it indeed is a bug. Restarting Filebeat should bring it back into a consistent state. While not perfect, it is at least a workaround.