robusta
Add the ability to get the pod list on k8s events that are related to nodes
It would be great if you could add the ability to get the pod list on k8s events that are related to nodes.
We tried to add the node_running_pods_enricher action to on_kubernetes_warning_event, see here:
- triggers:
  - on_kubernetes_warning_event: {}
  actions:
  - event_report: {}
  - event_resource_events: {}
  - node_running_pods_enricher: {}
But we are getting the following error from the runner:
2023-01-08 09:10:57.603 ERROR Action node_running_pods_enricher cannot be triggered by <class 'robusta.integrations.kubernetes.autogenerated.events.EventChangeEvent'>
2023-01-08 09:10:57.603 ERROR unknown error reloading playbooks. will try again when they next change
Traceback (most recent call last):
  File "/app/src/robusta/runner/config_loader.py", line 241, in __reload_playbook_packages
    (sinks_registry, playbooks_registry) = self.__prepare_runtime_config(
  File "/app/src/robusta/runner/config_loader.py", line 308, in __prepare_runtime_config
    playbooks_registry = PlaybooksRegistryImpl(
  File "/app/src/robusta/model/config.py", line 151, in __init__
    raise Exception(msg)
Exception: Action node_running_pods_enricher cannot be triggered by <class 'robusta.integrations.kubernetes.autogenerated.events.EventChangeEvent'>
Is there any chance of implementing this kind of feature?
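For readers hitting the same error: the traceback shows the playbook being rejected while the registry is built, which suggests each action is checked against the event type its trigger produces. Below is a minimal Python sketch of that kind of load-time check. It is not Robusta's actual code; the class hierarchy, the `TRIGGERS` and `ACTIONS` tables, and `validate_playbook` are all hypothetical stand-ins, and the event type each action accepts is an assumption inferred from the error message.

```python
# Illustrative stand-ins only: none of these classes or tables are Robusta's real code.
from dataclasses import dataclass
from typing import Dict, List, Type


class ExecutionEvent:
    """Hypothetical base class for anything a trigger can fire."""


class EventChangeEvent(ExecutionEvent):
    """Hypothetical event wrapping a Kubernetes Event object."""


class NodeEvent(ExecutionEvent):
    """Hypothetical event wrapping a Node object."""


@dataclass(frozen=True)
class Action:
    name: str
    accepts: Type[ExecutionEvent]  # event type the action's handler expects


# Assumed mapping: the warning-event trigger fires an EventChangeEvent.
TRIGGERS: Dict[str, Type[ExecutionEvent]] = {
    "on_kubernetes_warning_event": EventChangeEvent,
}

# Assumed mapping: the pod enricher expects a node, not a Kubernetes Event.
ACTIONS: Dict[str, Action] = {
    "event_report": Action("event_report", EventChangeEvent),
    "event_resource_events": Action("event_resource_events", EventChangeEvent),
    "node_running_pods_enricher": Action("node_running_pods_enricher", NodeEvent),
}


def validate_playbook(trigger: str, actions: List[str]) -> None:
    """Raise if any action cannot accept the event type fired by the trigger."""
    fired = TRIGGERS[trigger]
    for name in actions:
        if not issubclass(fired, ACTIONS[name].accepts):
            raise Exception(f"Action {name} cannot be triggered by {fired}")


if __name__ == "__main__":
    try:
        validate_playbook(
            "on_kubernetes_warning_event",
            ["event_report", "event_resource_events", "node_running_pods_enricher"],
        )
    except Exception as err:
        print(err)
```

Under these assumptions, the first two actions pass and the third fails, because an EventChangeEvent is not a NodeEvent even when the Kubernetes Event happens to be about a node.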
Thanks for reporting it @dsaydon90
Just to clarify, you'd like to get the list of pods running on the node, whenever there's a warning event on some node? Or is there another scenario for which you'd like to get the pods list?
Hi @arikalon1, thank you for your response.
Exactly, whenever there's a warning event on some node, to get the running pods on that node.
A good example would be something like: InvalidDiskCapacity Warning for Node None/ip-10-104-70-47.eu-west-1.compute.internal
In this case, it would be nice to have a list of the pods running on the node at that time, so I can investigate each pod further.
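To make the request concrete, here is a rough Python sketch of the lookup described above: take a warning event, check that it refers to a Node, and list the running pods scheduled on that node. It works on plain dictionaries shaped like the Kubernetes API objects (`involvedObject` on the Event, `spec.nodeName` and `status.phase` on the Pod). The helper names and the sample pods are hypothetical, and this is not how Robusta implements its enricher. Against a live cluster the same filtering can be done server-side with the field selector `spec.nodeName=<node>`.

```python
from typing import Any, Dict, List, Optional


def node_name_from_event(event: Dict[str, Any]) -> Optional[str]:
    """Return the node name if the event's involved object is a Node, else None."""
    involved = event.get("involvedObject", {})
    if involved.get("kind") != "Node":
        return None
    return involved.get("name")


def running_pods_on_node(pods: List[Dict[str, Any]], node_name: str) -> List[str]:
    """Return 'namespace/name' for every Running pod scheduled on node_name."""
    return [
        f"{pod['metadata']['namespace']}/{pod['metadata']['name']}"
        for pod in pods
        if pod.get("spec", {}).get("nodeName") == node_name
        and pod.get("status", {}).get("phase") == "Running"
    ]


if __name__ == "__main__":
    # Event modeled on the example in this thread; pod data is made up.
    node = "ip-10-104-70-47.eu-west-1.compute.internal"
    event = {
        "type": "Warning",
        "reason": "InvalidDiskCapacity",
        "involvedObject": {"kind": "Node", "name": node},
    }
    pods = [
        {"metadata": {"namespace": "default", "name": "web-1"},
         "spec": {"nodeName": node}, "status": {"phase": "Running"}},
        {"metadata": {"namespace": "default", "name": "job-1"},
         "spec": {"nodeName": node}, "status": {"phase": "Succeeded"}},
        {"metadata": {"namespace": "kube-system", "name": "dns-1"},
         "spec": {"nodeName": "other-node"}, "status": {"phase": "Running"}},
    ]
    target = node_name_from_event(event)
    if target is not None:
        print(running_pods_on_node(pods, target))
```

Events whose involved object is not a Node return None from the first helper, so a caller can skip the pod lookup for them.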
Thanks for the explanation @dsaydon90
Did you consider doing it using a Prometheus alert on the node? (For example, creating an alert for an unhealthy node and adding the list of pods and node events to it.) I'm asking because warning events are often temporary. When a node starts or scales down, there are often warning events.
Hi @arikalon1, in my case we are using New Relic, which is why we are trying to avoid installing Prometheus.
Hey @dsaydon90, we'd like to learn more about the issue before we start development. Can we meet for a short call? You can message me (Tomer) in the Robusta Slack Community.
Does it work for you?