Use fingerprint file identity by default and migrate all existing filestream inputs to it
We started seeing cases where the filestream-monitoring component logs an entry stating that the Elastic-Agent log file was truncated:
{
  "log.level": "info",
  "@timestamp": "2024-07-05T15:42:27.135Z",
  "message": "File was truncated. Reading file from offset 0. Path=/var/lib/elastic-agent/data/elastic-agent-8.14.1-1348b9/logs/elastic-agent-20240705-25.ndjson",
  "component": {
    "binary": "filebeat",
    "dataset": "elastic_agent.filebeat",
    "id": "filestream-monitoring",
    "type": "filestream"
  },
  "log": {
    "source": "filestream-monitoring"
  },
  "service.name": "filebeat",
  "id": "filestream-monitoring-agent",
  "path": "/var/lib/elastic-agent/data/elastic-agent-8.14.1-1348b9/logs/elastic-agent-20240705-25.ndjson",
  "ecs.version": "1.6.0",
  "log.logger": "input.filestream",
  "log.origin": {
    "file.line": 300,
    "file.name": "filestream/input.go",
    "function": "github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).openFile"
  },
  "source_file": "filestream::filestream-monitoring-agent::native::524773-66308",
  "state-id": "native::524423-66428"
}
We know the Elastic-Agent does not truncate its log files, so the only reason for Filebeat to detect a truncation is inode reuse.
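To make the failure mode concrete, here is a minimal sketch of how an inode-based identity can mistake a brand-new file for a truncated one. It is not Filebeat's code: `FileInfo`, `native_identity` and `resume_offset` are hypothetical names, the key layout is only modelled on the `native::…` state-id in the log above, and the truncation check (current size smaller than the stored offset) is an assumption about how the message is triggered.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    inode: int
    device: int
    size: int  # current size in bytes


def native_identity(info: FileInfo) -> str:
    # Assumed key layout, modelled on the "native::524423-66428" state-id above.
    return f"native::{info.inode}-{info.device}"


def resume_offset(registry: dict, info: FileInfo) -> tuple:
    """Return (offset to read from, whether a truncation was reported)."""
    stored = registry.get(native_identity(info), 0)
    if info.size < stored:
        # The file is smaller than what was already read: it looks truncated.
        return 0, True
    return stored, False


registry = {}
old_log = FileInfo(inode=1001, device=64, size=10_000)
registry[native_identity(old_log)] = old_log.size  # old file was fully read

# The old file is removed and a new, shorter file receives the same inode.
new_log = FileInfo(inode=1001, device=64, size=200)
offset, truncated = resume_offset(registry, new_log)
print(offset, truncated)
```

In this sketch, a new file that happened to be larger than the stored offset would be resumed from the middle instead, silently skipping its beginning without any log entry.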
The best way to prevent inode reuse from affecting the monitoring logs is to use a file identity that won't be re-used. The best option for that is the fingerprint file identity.
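A rough sketch of why a fingerprint is not affected by inode reuse follows. It assumes the fingerprint is a SHA-256 hash over the first 1024 bytes of the file, which is how the filestream documentation describes the default, and that files shorter than that length cannot be identified yet. The function name and the `fingerprint::` prefix are illustrative, not Filebeat's internals.

```python
import hashlib
from typing import Optional


def fingerprint_identity(data: bytes, offset: int = 0, length: int = 1024) -> Optional[str]:
    """Identify a file by hashing a fixed-size chunk of its content."""
    chunk = data[offset:offset + length]
    if len(chunk) < length:
        # Not enough data to fingerprint; assumed to be skipped until it grows.
        return None
    return "fingerprint::" + hashlib.sha256(chunk).hexdigest()


# Two log files with different content: even if the second one reuses the
# inode of the first, their identities differ.
first_file = b"A" * 2048
second_file = b"B" * 2048
id_first = fingerprint_identity(first_file)
id_second = fingerprint_identity(second_file)

# Appending to a file does not change its identity, because only the
# leading chunk is hashed.
id_first_grown = fingerprint_identity(first_file + b"more log lines\n")

# A file shorter than the fingerprint length has no identity yet.
id_short = fingerprint_identity(b"too short")
```

The trade-off in this scheme is that files must reach the fingerprint length before they are picked up, and files sharing an identical leading chunk would be treated as the same file.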
We should either start using the fingerprint file identity or at least make it configurable so users facing this issue can work around the inode reuse.
One very important thing to bear in mind: if we change the file identity, the stored state of existing files will no longer match their new identity, so all of them will be considered new and re-ingested.
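The re-ingestion risk can be sketched as a registry lookup miss. The example below is only an illustration: the registry is modelled as a plain dictionary from identity key to offset, the fingerprint key is a placeholder value, and `migrate_state` is a hypothetical helper showing one possible shape of a migration, not an existing Filebeat function.

```python
def start_offset(registry: dict, key: str) -> int:
    # An unknown key means the file is treated as new and read from offset 0.
    return registry.get(key, 0)


def migrate_state(registry: dict, old_key: str, new_key: str) -> None:
    # Hypothetical migration step: carry the stored offset over to the new key.
    if old_key in registry and new_key not in registry:
        registry[new_key] = registry.pop(old_key)


old_key = "native::524423-66428"        # state-id format taken from the log above
new_key = "fingerprint::" + "ab" * 32   # placeholder hash, for illustration only

registry = {old_key: 4096}

# Without a migration, the new identity is unknown and the file is re-read.
offset_before = start_offset(registry, new_key)

# With a migration, the previously stored offset is preserved.
migrate_state(registry, old_key, new_key)
offset_after = start_offset(registry, new_key)
```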