terraform-datadog-platform
Change deployment.replicas_ready to deployment.replicas_available as replicas_ready doesn't exist
what
According to the Datadog Kubernetes metrics documentation, kubernetes_state.deployment.replicas_ready does not appear to be a metric that is shipped: https://docs.datadoghq.com/agent/kubernetes/data_collected/#kubernetes-state
As I am new to Datadog, I am unsure whether this metric used to exist and was later deprecated.
I also cannot find the kubernetes_state.deployment.replicas_ready metric in my Datadog environment.
why
The "(k8s) Deployment Replica Pod is down" monitor is incorrectly reporting that my deployment replica pods are down for deployments with >= 2 pods.
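One possible explanation for this symptom, sketched below in Python, is that a metric with no data ends up contributing nothing to the subtraction in the monitor query. This is an assumption for illustration only and is not confirmed anywhere in this thread; the function names and the threshold of 2 are hypothetical stand-ins for the monitor's "desired minus ready" comparison.

```python
from typing import Optional

# Assumed alert threshold: fire when (desired - ready) >= 2.
CRITICAL_THRESHOLD = 2


def replica_gap(desired: float, ready: Optional[float]) -> float:
    """Return desired minus ready.

    ASSUMPTION: a metric that reports no data (None) is treated as 0.
    This is a guess at the failure mode, not documented Datadog behavior.
    """
    return desired - (ready if ready is not None else 0.0)


def would_alert(desired: float, ready: Optional[float]) -> bool:
    """True when the gap reaches the assumed critical threshold."""
    return replica_gap(desired, ready) >= CRITICAL_THRESHOLD


# A healthy 3-replica deployment whose "ready" metric is missing
# would still look like it has 3 pods down under this assumption.
print(would_alert(desired=3, ready=None))
print(would_alert(desired=3, ready=3))
```

Under that assumption, any deployment with two or more desired replicas would trip the monitor regardless of its real state, which matches the behavior described above.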
/test all
@btai24 I remember having my own issues with this monitor and I believe I just removed usage of it from my client's clusters. I will check their DD account and confirm this is not used, but I'm 👍 on making this happen.
@Gowiem It's possible I may stop using it as well in the future, since it overlaps with some of the other monitors, which give much more granular information than this one. I just got this up yesterday, so I'll let the monitor sit. I did have to change the query to what's in this PR because I was getting false negatives otherwise.
/rebuild-readme
@btai24 thank you for the PR. The change looks reasonable, but if you look at the "Recommended Monitors" in the Datadog UI, they include the query in question.
If we click on the monitor and start searching, we can see the kubernetes_state.deployment.replicas_ready metric, and you can export the monitor:
{
    "id": 0,
    "name": "[kubernetes] Monitor Kubernetes Deployments Replica Pods",
    "type": "query alert",
    "query": "avg(last_15m):avg:kubernetes_state.deployment.replicas_desired{*} by {deployment} - avg:kubernetes_state.deployment.replicas_ready{*} by {deployment} >= 2",
    "message": "More than one Deployments Replica's pods are down.",
    "tags": [
        "integration:kubernetes"
    ],
    "options": {
        "notify_audit": true,
        "locked": false,
        "timeout_h": 0,
        "new_host_delay": 300,
        "require_full_window": false,
        "notify_no_data": true,
        "renotify_interval": "0",
        "escalation_message": "",
        "no_data_timeframe": 5,
        "include_tags": true,
        "thresholds": {
            "critical": 2
        }
    },
    "priority": null
}
Maybe not all DD accounts are created equal. Maybe DD made a mistake describing that Recommended Monitor (it's still in Beta).
We'll wait for your confirmation that your changes are working for you and you see the monitor working.
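For anyone who wants to try the proposed change against an exported monitor, the metric swap can be sketched in Python as follows. This is only an illustration: the `swap_metric` helper is hypothetical, and the embedded monitor is a trimmed copy of the export, keeping only the fields the sketch needs.

```python
import json

OLD_METRIC = "kubernetes_state.deployment.replicas_ready"
NEW_METRIC = "kubernetes_state.deployment.replicas_available"


def swap_metric(monitor: dict) -> dict:
    """Return a copy of the monitor with OLD_METRIC replaced in its query.

    Hypothetical helper for illustration; the input dict is not modified.
    """
    updated = dict(monitor)
    updated["query"] = monitor["query"].replace(OLD_METRIC, NEW_METRIC)
    return updated


# Trimmed version of the exported monitor (name, type, and query only).
exported = json.loads("""
{
    "name": "[kubernetes] Monitor Kubernetes Deployments Replica Pods",
    "type": "query alert",
    "query": "avg(last_15m):avg:kubernetes_state.deployment.replicas_desired{*} by {deployment} - avg:kubernetes_state.deployment.replicas_ready{*} by {deployment} >= 2"
}
""")

patched = swap_metric(exported)
print(patched["query"])
```

Only the subtracted metric changes; the `replicas_desired` term, the grouping by deployment, and the `>= 2` threshold stay as exported.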
This pull request is now in conflict. Could you fix it @btai24? 🙏
Closing this as currently both replicas_ready and replicas_available are available.