aws-node-termination-handler issues

Metric counters don't report 0 to Prometheus

5

NTH does not report 0 when a metric has not been observed during runtime. Prometheus expects counters to report 0 when they are 0. If they are simply not reported,...

risinger

Type: Bug

Priority: Medium

Status: Help Wanted

stalebot-ignore
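
The counter behaviour described in the issue above can be illustrated with a small, self-contained sketch. This is not NTH code and does not use a Prometheus client library; the `Counter` class, the metric name `actions_total`, and the label values `cordon` and `drain` are all hypothetical. It only shows the difference between a counter that exports nothing until its first increment and one that pre-registers its known label values at 0.

```python
from collections import OrderedDict


class Counter:
    """Tiny stand-in for a labeled Prometheus counter (hypothetical, not NTH code)."""

    def __init__(self, name, label_name, known_label_values=()):
        self.name = name
        self.label_name = label_name
        self._values = OrderedDict()
        # Pre-register every known label value at 0 so the series is exported
        # before the first event is ever observed.
        for value in known_label_values:
            self._values[value] = 0

    def inc(self, label_value, amount=1):
        # Create the series on first use if it was not pre-registered.
        self._values[label_value] = self._values.get(label_value, 0) + amount

    def render(self):
        """Render the counter in the Prometheus text exposition format."""
        lines = [f"# TYPE {self.name} counter"]
        for label_value, count in self._values.items():
            lines.append(f'{self.name}{{{self.label_name}="{label_value}"}} {count}')
        return "\n".join(lines)


# Lazy counter: no series exist until something is observed.
lazy = Counter("actions_total", "action")

# Eager counter: every known label value is exported as 0 from the start.
eager = Counter("actions_total", "action", known_label_values=("cordon", "drain"))
eager.inc("drain")

print(lazy.render())
print(eager.render())
```

With the lazy counter a scrape contains only the `# TYPE` line, so a query over the series has nothing to work with until the first event. The eager counter exports `cordon` at 0 and `drain` at 1. Real client libraries generally offer a way to initialize labeled series up front; whether that fits NTH's metrics setup is an assumption here.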

NTH should issue lifecycle heartbeats

13

I've been using the NTH in queue processor mode. This implementation uses a lifecycle hook associated with the node instance to trigger the NTH to cordon/drain. Lifecycle hooks support two...

jrsherry

Type: Enhancement

Priority: High

Status: Help Wanted

stalebot-ignore

Add custom delay for instance refresh actions

5

When using instance refresh to update ASGs it looks like the events come through with a start date of now which triggers the node-termination handler to start cordoning and draining...

stevehipwell

Type: Enhancement

Priority: Medium

stalebot-ignore

Improve Rebalance Recommendation documentation in README

2

Clarify that the rebalance recommendation only applies to Spot instances and that it's a separate event type from an AZ rebalance #416

haugenj

Priority: Low

docs

stalebot-ignore

Support helm values validation with json schema

**Describe the feature** It would be nice to add a [values.schema.json](https://helm.sh/docs/faq/changes_since_helm2/#validating-chart-values-with-jsonschema) after the [JSON schema deprecated flag](http://json-schema.org/draft/2019-09/json-schema-validation.html#rfc.section.9.3) feature becomes [supported](https://github.com/helm/helm/issues/10732) in Helm (looks like [this PR](https://github.com/helm/helm/pull/11340) needs to be merged...

pdk27

All retries failed, unable to complete the uncordon after reboot workflow error

**Describe the bug** Hi, In the logs right after the NTH starts we can see errors frequently like below ``` 2022/09/08 08:18:46 ERR Error when trying to list Nodes w/...

sushantsoni5392