watcher errors

Open VadimPlekhanov opened this issue 10 months ago • 5 comments

On some nodes, the tetragon_watcher_errors_total metric reports errors. What does this mean? How can I fix it, and how critical is it?

VadimPlekhanov avatar Feb 09 '25 10:02 VadimPlekhanov

Maybe this is related to the following events that I see on the nodes: level=info msg="Namespace to Id map caused eviction" cgid=18168355 id=3807 ns=main

VadimPlekhanov avatar Feb 09 '25 11:02 VadimPlekhanov

Hi,

Thanks for the report. Can you please take a sysdump? The procedure is described at https://tetragon.io/docs/troubleshooting/#automatic-log--state-collection.

kkourt avatar Feb 10 '25 12:02 kkourt

Hi, thanks. The dump contains a lot of sensitive information. What information is needed to solve the problem? Can I send you an email?

VadimPlekhanov avatar Feb 10 '25 16:02 VadimPlekhanov

@kkourt Can I send you an email?

VadimPlekhanov avatar Feb 12 '25 13:02 VadimPlekhanov

@kkourt Can I send you an email?

In the Cilium OSS project in general and Tetragon specifically, we favor having all communication open (e.g., in GitHub issues) so that all users can benefit from and participate in the discussions. If you need enterprise-level support, I suggest contacting vendors that offer this (see https://tetragon.io/enterprise/).

An alternative would be to remove the sensitive information on your own. One of the first things I would look at would be log messages with severity warn (level=warn) or higher.
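As a rough sketch of that filtering step, assuming the agent log is in the same key=value text format as the level=info line quoted earlier in this thread, something like the following could pull out only the warn-or-higher lines before sharing them. The function names, the severity ordering, and the second sample line are illustrative assumptions, not part of Tetragon.

```python
import re

# Assumed severity order, based on common level names in key=value logs.
LEVELS = {
    "trace": 0, "debug": 1, "info": 2,
    "warn": 3, "warning": 3, "error": 4, "fatal": 5, "panic": 6,
}

# Matches key=value and key="quoted value" pairs in a logfmt-style line.
PAIR_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S*)')


def parse_logfmt(line):
    """Return the key/value fields of one log line as a dict."""
    fields = {}
    for key, value in PAIR_RE.findall(line):
        # Strip the surrounding quotes from quoted values.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[key] = value
    return fields


def warn_or_higher(lines, minimum="warn"):
    """Keep only lines whose level field is at or above `minimum`."""
    threshold = LEVELS[minimum]
    kept = []
    for line in lines:
        level = parse_logfmt(line).get("level", "").lower()
        # Lines without a recognizable level are dropped.
        if LEVELS.get(level, -1) >= threshold:
            kept.append(line.rstrip("\n"))
    return kept


sample = [
    # Line quoted earlier in this thread.
    'level=info msg="Namespace to Id map caused eviction" cgid=18168355 id=3807 ns=main',
    # Made-up line, only here to show what gets kept.
    'level=warn msg="hypothetical warning for illustration"',
]
for line in warn_or_higher(sample):
    print(line)
```

The output still needs a manual pass for sensitive values (pod names, namespaces, and so on) before it is posted publicly.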

For some context, the watcher is the Tetragon component that watches the K8s API server for new information, such as the specs of locally running pods.

kkourt avatar Feb 12 '25 14:02 kkourt

I'll close this issue as it looks stale. Feel free to reopen.

mtardy avatar Jul 16 '25 09:07 mtardy