Optimizing Filebeat for Reading Only New Text Files
After installing Filebeat through the MSI, Filebeat takes too long and uses too much CPU scanning all text files, including older ones. We never delete the logs folder when uninstalling the Filebeat MSI, so I wonder why it re-reads existing text files that Filebeat had already scanned. Is there an option to skip the existing text files and read only new ones?
### Tasks
- [ ] https://github.com/elastic/beats/pull/39744
Filebeat, by default, attempts to monitor all files within the specified paths in its configuration, including older files, which can lead to high CPU usage and longer scanning times. To configure Filebeat to read only new text files and avoid re-reading existing files that were already scanned, you can use a combination of configuration options.
Here are some steps and configurations you can apply to optimize Filebeat's performance:
1. Registry File
Filebeat maintains a registry file that keeps track of the state of files it has already read. Ensure that the registry file is not deleted when uninstalling or reinstalling Filebeat. This file is typically located at C:\ProgramData\filebeat\registry\filebeat on Windows.
2. Configure the Inputs
Adjust your Filebeat configuration to specify file input settings carefully. Here’s an example configuration in the filebeat.yml file to ensure Filebeat focuses on new files:
Example filebeat.yml Configuration:
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - C:\path\to\your\logs\*.log
    # Ignore older files
    ignore_older: 24h  # Adjust the time period according to your needs
    # To handle large files, configure scan_frequency and clean_inactive
    scan_frequency: 10s  # How often to scan for new files
    # Clean files older than this time period from the registry
    clean_inactive: 48h  # Adjust the time period according to your needs
# Ensure that the registry file is kept
filebeat.registry.path: "C:/ProgramData/filebeat/registry"
3. Ignore Older Files
The ignore_older setting makes Filebeat skip files whose last modification time is older than the specified duration. This reduces load because old log files are no longer opened and processed.
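To make the age check concrete, here is a minimal Python sketch of the comparison that ignore_older implies. It is an illustration only, not Filebeat's actual code; the function name `is_ignored` and the sample timestamps are made up for this example.

```python
from datetime import datetime, timedelta


def is_ignored(modified_at: datetime, now: datetime, ignore_older: timedelta) -> bool:
    """Return True if a file last modified at `modified_at` is older than the window."""
    return now - modified_at > ignore_older


# Hypothetical timestamps, with a 24h window as in the example configuration.
now = datetime(2024, 6, 1, 12, 0, 0)
window = timedelta(hours=24)

print(is_ignored(datetime(2024, 5, 30, 9, 0, 0), now, window))  # True: modified 51h ago
print(is_ignored(datetime(2024, 6, 1, 8, 0, 0), now, window))   # False: modified 4h ago
```

The key point is that the decision depends on the file's modification time, not on when Filebeat was installed, so an old file that gets appended to falls back inside the window.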
4. Scan Frequency
The scan_frequency setting controls how often Filebeat scans the configured paths for new files. Scanning less often, that is, setting a larger scan_frequency value, can also help decrease CPU usage.
5. Clean Inactive Files
The clean_inactive setting removes a file's state entry from the registry once the file has been inactive for longer than the specified duration. This keeps the registry file from growing indefinitely and helps Filebeat focus on new files.
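The Filebeat documentation requires clean_inactive to be greater than ignore_older + scan_frequency; otherwise a file's state can be removed while the file is still being picked up, and it may be read again from the start. The Python sketch below checks that relationship for a set of values before you put them in filebeat.yml. The helpers `parse_duration` and `clean_inactive_is_safe` are hypothetical names for this example, and the parser only handles simple h/m/s durations such as 24h or 1h30m.

```python
import re
from datetime import timedelta

# Map the duration suffixes used in the example config to timedelta keywords.
_UNITS = {"h": "hours", "m": "minutes", "s": "seconds"}


def parse_duration(text: str) -> timedelta:
    """Parse simple durations such as '24h', '10s' or '1h30m'."""
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject anything that is not made up entirely of <number><unit> pairs.
    if not parts or "".join(number + unit for number, unit in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    total = timedelta()
    for number, unit in parts:
        total += timedelta(**{_UNITS[unit]: int(number)})
    return total


def clean_inactive_is_safe(ignore_older: str, scan_frequency: str, clean_inactive: str) -> bool:
    """Check that clean_inactive > ignore_older + scan_frequency."""
    return parse_duration(clean_inactive) > (
        parse_duration(ignore_older) + parse_duration(scan_frequency)
    )


print(clean_inactive_is_safe("24h", "10s", "48h"))  # True: values from the example config
print(clean_inactive_is_safe("24h", "10s", "24h"))  # False: clean_inactive is too short
```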
6. Delete Older Log Files
Regularly clean up old log files from the directory if they are no longer needed. This helps in reducing the number of files Filebeat needs to scan.
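One way to automate this cleanup is a small script run on a schedule. The Python sketch below is an assumption about how you might do it, not something Filebeat provides: `find_old_logs` and `delete_old_logs` are hypothetical helpers, the `*.log` pattern matches the example path above, and the default is a dry run so nothing is deleted until you opt in.

```python
import time
from pathlib import Path


def find_old_logs(folder, max_age_hours, pattern="*.log", now=None):
    """Return files in `folder` matching `pattern` whose mtime is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    return sorted(
        path
        for path in Path(folder).glob(pattern)
        if path.is_file() and path.stat().st_mtime < cutoff
    )


def delete_old_logs(folder, max_age_hours, dry_run=True):
    """List old log files and, unless dry_run is True, delete them."""
    old_files = find_old_logs(folder, max_age_hours)
    if not dry_run:
        for path in old_files:
            path.unlink()
    return old_files
```

For example, `delete_old_logs(r"C:\path\to\your\logs", 24 * 7)` would only report files older than a week; pass `dry_run=False` once the list looks right. Choose a retention period longer than ignore_older so that files are not removed before Filebeat has finished with them.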
7. Filebeat Modules
If applicable, consider using Filebeat modules designed for specific log types. Modules are preconfigured to handle specific log formats and can optimize performance for those types of logs.
Implementation Steps:
- Edit the Configuration File: Open the filebeat.yml configuration file located in the Filebeat installation directory.
- Modify Settings: Apply the settings as shown in the example above.
- Restart Filebeat: After modifying the configuration, restart the Filebeat service to apply the changes.
# Restart Filebeat on Windows
Restart-Service filebeat
Troubleshooting:
- Verify Registry File: Ensure the registry file is maintained across Filebeat installations.
- Check Logs: Monitor Filebeat logs to ensure it’s ignoring older files and only reading new files as configured.
- Resource Monitoring: Use Windows Task Manager or similar tools to monitor CPU usage and ensure the configuration changes are effective.
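For the registry check, a quick script can confirm that the registry directory survived an uninstall before you reinstall. This is a minimal sketch: `registry_has_state` is a hypothetical helper, and it only checks that the directory exists and contains at least one file, not that the contents are valid.

```python
from pathlib import Path


def registry_has_state(registry_dir) -> bool:
    """Return True if the registry directory exists and holds at least one file."""
    root = Path(registry_dir)
    return root.is_dir() and any(path.is_file() for path in root.rglob("*"))
```

Calling `registry_has_state("C:/ProgramData/filebeat/registry")` after uninstalling should return True if the state was preserved; if it returns False, Filebeat has no record of earlier reads and will treat every file within the ignore_older window as new.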
By applying these configurations and practices, you can optimize Filebeat to read only new text files and reduce CPU usage, thereby improving overall performance.
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
Closing as an answer has been provided.