Kubernetes Events Analyzer

Open diamonwiggins opened this issue 2 years ago • 4 comments

Describe the rationale for the suggested feature.

Key indicators of problems often show up in Kubernetes events. These events surface issues related to scheduling, lack of resources, and readiness timeouts. We should have an Events analyzer where you can specify one or all namespaces, match on text, and evaluate the result as a Warn/Fail/Pass condition, similar to textAnalyze: https://troubleshoot.sh/docs/analyze/regex/

Describe the feature

apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: bundle
spec:
  collectors:
    - clusterResources: {}
  analyzers:
    - eventsAnalyze:
        checkName: "insufficient-cpu"
        namespace: my-app
        regex: '.*Insufficient cpu.*'
        outcomes:
          - pass:
              when: "false"
              message: "Sufficient CPU for all Pods in my-app namespace"
          - fail:
              when: "true"
              message: "Insufficient CPU for some Pods in my-app namespace"
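
To make the intended semantics of the spec above concrete, here is a minimal Python sketch of the evaluation it implies. This is not troubleshoot's implementation; the function name `evaluate_events` is hypothetical, and the event dictionaries are assumed to follow the shape of items returned by `kubectl get events -o json`. A regex match on any event message in the selected namespace is treated as `when: "true"` (fail), and no match as `when: "false"` (pass).

```python
import re


def evaluate_events(events, pattern, namespace=None):
    """Return "fail" if any in-scope event message matches, else "pass".

    Hypothetical helper: `namespace=None` means "all namespaces".
    """
    regex = re.compile(pattern)
    matched = any(
        regex.search(event.get("message", ""))
        for event in events
        if namespace is None or event["metadata"]["namespace"] == namespace
    )
    return "fail" if matched else "pass"


# Sample events, reduced to the fields this sketch reads.
events = [
    {"metadata": {"namespace": "my-app"},
     "message": "0/3 nodes are available: 3 Insufficient cpu."},
    {"metadata": {"namespace": "other"},
     "message": "Started container web"},
]

print(evaluate_events(events, r".*Insufficient cpu.*", namespace="my-app"))
```

Scoping by namespace before matching keeps a noisy neighbour namespace from failing a check that targets only `my-app`.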

Describe alternatives you've considered

Additional context

diamonwiggins avatar Dec 15 '22 15:12 diamonwiggins

I really like this. I wonder if we can make the outcomes even more actionable. Do you see any reasonable way we could tell them which pods the event was attached to, for example? The tricky part is that events can relate to more than just pods, but depending on how events are stored, the analyzer may have access to what the event was related to (a node name, a pod name, etc.).

chris-sanders avatar Dec 16 '22 20:12 chris-sanders
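
One way to get at the "what was this event attached to" question raised above: a Kubernetes Event carries an `involvedObject` reference with `kind` and `name`, so matched events can be grouped by object rather than assumed to be pods. The sketch below is illustrative only; `matched_objects` is a hypothetical helper, not part of troubleshoot, and the sample events are made up.

```python
import re
from collections import defaultdict


def matched_objects(events, pattern):
    """Group the objects referenced by matching events, keyed by kind."""
    regex = re.compile(pattern)
    grouped = defaultdict(set)
    for event in events:
        if regex.search(event.get("message", "")):
            obj = event.get("involvedObject", {})
            grouped[obj.get("kind", "Unknown")].add(obj.get("name", "unknown"))
    # Sort names so the resulting message is stable between runs.
    return {kind: sorted(names) for kind, names in grouped.items()}


events = [
    {"message": "0/3 nodes are available: 3 Insufficient cpu.",
     "involvedObject": {"kind": "Pod", "name": "web-2"}},
    {"message": "0/3 nodes are available: 3 Insufficient cpu.",
     "involvedObject": {"kind": "Pod", "name": "web-1"}},
    {"message": "Node node-a status is now: NodeReady",
     "involvedObject": {"kind": "Node", "name": "node-a"}},
]

print(matched_objects(events, r"Insufficient cpu"))
```

An outcome message could then interpolate the grouped names, so the reader sees which pods or nodes to inspect.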

I think we likely need to be able to parse at least the Reason and Message fields (either separately or together?), and possibly the Type field as well (Normal vs. Warning, etc.), though that is less relevant.

CpuID avatar Feb 09 '23 23:02 CpuID
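
A rough sketch of what matching on Reason, Message, and Type separately could look like, assuming each filter is optional and all supplied filters must hold. The function `event_matches` and its parameter names are hypothetical; only the event field names (`reason`, `message`, `type`) come from the Kubernetes Event object.

```python
import re


def event_matches(event, reason=None, message_regex=None, event_type=None):
    """Return True if the event satisfies every filter that was supplied."""
    if reason is not None and event.get("reason") != reason:
        return False
    if event_type is not None and event.get("type") != event_type:
        return False
    if message_regex is not None and not re.search(
        message_regex, event.get("message", "")
    ):
        return False
    return True


event = {
    "type": "Warning",
    "reason": "FailedScheduling",
    "message": "0/3 nodes are available: 3 Insufficient cpu.",
}

# Filters can be used separately or together.
print(event_matches(event, reason="FailedScheduling"))
print(event_matches(event, reason="FailedScheduling", message_regex=r"Insufficient cpu"))
print(event_matches(event, event_type="Normal"))
```

Treating an omitted filter as "match anything" lets the same check cover both the separate and the combined case mentioned above.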

Minor comment: I'd call the analyser events or kubernetesEvents to follow the naming convention of most other analysers

banjoh avatar Feb 07 '24 16:02 banjoh

The PR for this feature is about to be merged. However, we still need to document the new analyzer, so I'll re-open this issue.

xavpaice avatar Mar 06 '24 19:03 xavpaice