
[KSPM EKS] Cloudbeat can't read kubelet process

Open amirbenun opened this issue 1 year ago • 1 comments

Background

An alert was triggered because KSPM on EKS produced 400 findings instead of the expected 411.

Research

  1. As you can see in the findings index, the 11 missing findings used to come from the ip-10-0-3-115.eu-west-1.compute.internal node, where cloudbeat analyzes the kubelet process. This time, for an unknown reason, that analysis failed.
  2. The cloudbeat logs from the relevant time show that cloudbeat failed to read the process from the filesystem: Error running fetcher for key process: open /hostfs/proc/2626/stat: no such file or directory.
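The root cause was not determined, so the following is only a minimal sketch of one plausible scenario: the process exits or restarts under a new PID between the moment the PIDs are listed and the moment the stat file is read, which would surface as exactly this "no such file or directory" error. The function names statPath and readProcStat are hypothetical and are not taken from the cloudbeat code base; the sketch only shows how a missing stat file could be told apart from other read errors.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strconv"
)

// statPath builds the path to a process's stat file under a mounted host
// root such as /hostfs. Hypothetical helper, not cloudbeat's actual code.
func statPath(hostfs string, pid int) string {
	return filepath.Join(hostfs, "proc", strconv.Itoa(pid), "stat")
}

// readProcStat returns the raw contents of the stat file for pid.
// A missing file is reported as found == false with a nil error, so the
// caller can distinguish "process is gone" from a real I/O failure.
func readProcStat(hostfs string, pid int) (data string, found bool, err error) {
	b, err := os.ReadFile(statPath(hostfs, pid))
	if errors.Is(err, fs.ErrNotExist) {
		// Assumption: the process exited or restarted after its PID was listed.
		return "", false, nil
	}
	if err != nil {
		return "", false, fmt.Errorf("read stat for pid %d: %w", pid, err)
	}
	return string(b), true, nil
}

func main() {
	_, found, err := readProcStat("/hostfs", 2626)
	switch {
	case err != nil:
		fmt.Println("unexpected error:", err)
	case !found:
		fmt.Println("process 2626 not found; it may have exited or restarted")
	default:
		fmt.Println("stat read OK")
	}
}
```

Whether a vanished process should be skipped, retried, or logged at a lower severity is a design choice that depends on how the fetcher is expected to behave between cycles.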

Disclaimer

On the next cycle, things returned to normal: cloudbeat read and analyzed the kubelet process correctly and generated the missing 11 findings.

Next steps

We don't have enough data to understand what caused this error. I suggest waiting to see whether the issue happens again and prioritizing it accordingly.

amirbenun avatar Mar 06 '24 09:03 amirbenun

The process fetcher logs were already enhanced as part of https://github.com/elastic/cloudbeat/pull/1831. I'm adding a few more improvements that should help us understand the cause the next time it happens:

  • https://github.com/elastic/cloudbeat/pull/2013

amirbenun avatar Mar 10 '24 16:03 amirbenun