
Use fingerprint file identity by default and migrate all existing filestream inputs to it

Open belimawr opened this issue 7 months ago • 8 comments

We have started seeing cases where the filestream-monitoring input logs an entry stating that an Elastic-Agent log file was truncated:

{
  "log.level": "info",
  "@timestamp": "2024-07-05T15:42:27.135Z",
  "message": "File was truncated. Reading file from offset 0. Path=/var/lib/elastic-agent/data/elastic-agent-8.14.1-1348b9/logs/elastic-agent-20240705-25.ndjson",
  "component": {
    "binary": "filebeat",
    "dataset": "elastic_agent.filebeat",
    "id": "filestream-monitoring",
    "type": "filestream"
  },
  "log": {
    "source": "filestream-monitoring"
  },
  "service.name": "filebeat",
  "id": "filestream-monitoring-agent",
  "path": "/var/lib/elastic-agent/data/elastic-agent-8.14.1-1348b9/logs/elastic-agent-20240705-25.ndjson",
  "ecs.version": "1.6.0",
  "log.logger": "input.filestream",
  "log.origin": {
    "file.line": 300,
    "file.name": "filestream/input.go",
    "function": "github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).openFile"
  },
  "source_file": "filestream::filestream-monitoring-agent::native::524773-66308",
  "state-id": "native::524423-66428"
}

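We know the Elastic-Agent does not truncate its log files, so the only reason for Filebeat to detect a truncation is inode reuse: a new file is assigned the inode of a deleted one, and Filebeat matches it to the old file's state.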

The best way to prevent inode reuse from affecting the monitoring logs is to use a file identity that is not reused. The best option for that is the fingerprint file identity, which identifies a file by a hash of its content rather than by its inode.

We should either start using the fingerprint file identity or at least make it configurable so users facing this issue can work around the inode reuse.

One very important thing to bear in mind is that if we change the file identity, all existing files will be considered new and re-ingested.

belimawr avatar Jul 08 '24 20:07 belimawr