
Cordoning & Draining all healthy nodes

Open sstarcher opened this issue 5 years ago • 2 comments

I installed draino to test it out, and it immediately cordoned all of my nodes. After turning on debug, the logs are as follows. It was installed via the helm chart in a Kubernetes 1.14.6 cluster.

```
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.056Z	DEBUG	kubernetes/eventhandler.go:114	Cordoning	{"node": "ip-10-40-1-42.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.056Z	INFO	kubernetes/eventhandler.go:123	Cordoned	{"node": "ip-10-40-1-42.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.056Z	INFO	kubernetes/eventhandler.go:132	Scheduled drain	{"node": "ip-10-40-1-42.us-west-2.compute.internal", "after": "2019-10-15T15:01:23.937Z"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.056Z	DEBUG	kubernetes/eventhandler.go:114	Cordoning	{"node": "ip-10-40-6-32.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.056Z	INFO	kubernetes/eventhandler.go:123	Cordoned	{"node": "ip-10-40-6-32.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.056Z	INFO	kubernetes/eventhandler.go:132	Scheduled drain	{"node": "ip-10-40-6-32.us-west-2.compute.internal", "after": "2019-10-15T15:11:23.937Z"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.057Z	DEBUG	kubernetes/eventhandler.go:114	Cordoning	{"node": "ip-10-40-8-50.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.057Z	INFO	kubernetes/eventhandler.go:123	Cordoned	{"node": "ip-10-40-8-50.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.057Z	INFO	kubernetes/eventhandler.go:132	Scheduled drain	{"node": "ip-10-40-8-50.us-west-2.compute.internal", "after": "2019-10-15T15:21:23.937Z"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.057Z	DEBUG	kubernetes/eventhandler.go:114	Cordoning	{"node": "ip-10-40-11-78.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.057Z	INFO	kubernetes/eventhandler.go:123	Cordoned	{"node": "ip-10-40-11-78.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.058Z	INFO	kubernetes/eventhandler.go:132	Scheduled drain	{"node": "ip-10-40-11-78.us-west-2.compute.internal", "after": "2019-10-15T15:31:23.937Z"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.058Z	DEBUG	kubernetes/eventhandler.go:114	Cordoning	{"node": "ip-10-40-2-117.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.058Z	INFO	kubernetes/eventhandler.go:123	Cordoned	{"node": "ip-10-40-2-117.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.058Z	INFO	kubernetes/eventhandler.go:132	Scheduled drain	{"node": "ip-10-40-2-117.us-west-2.compute.internal", "after": "2019-10-15T15:41:23.937Z"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.058Z	DEBUG	kubernetes/eventhandler.go:114	Cordoning	{"node": "ip-10-40-6-253.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.135Z	INFO	kubernetes/eventhandler.go:123	Cordoned	{"node": "ip-10-40-6-253.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.135Z	INFO	kubernetes/eventhandler.go:132	Scheduled drain	{"node": "ip-10-40-6-253.us-west-2.compute.internal", "after": "2019-10-15T15:51:23.937Z"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.135Z	DEBUG	kubernetes/eventhandler.go:114	Cordoning	{"node": "ip-10-40-1-213.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.135Z	INFO	kubernetes/eventhandler.go:123	Cordoned	{"node": "ip-10-40-1-213.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.135Z	INFO	kubernetes/eventhandler.go:132	Scheduled drain	{"node": "ip-10-40-1-213.us-west-2.compute.internal", "after": "2019-10-15T16:01:23.937Z"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.135Z	DEBUG	kubernetes/eventhandler.go:114	Cordoning	{"node": "ip-10-40-3-232.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.135Z	INFO	kubernetes/eventhandler.go:123	Cordoned	{"node": "ip-10-40-3-232.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.135Z	INFO	kubernetes/eventhandler.go:132	Scheduled drain	{"node": "ip-10-40-3-232.us-west-2.compute.internal", "after": "2019-10-15T16:11:23.937Z"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.135Z	DEBUG	kubernetes/eventhandler.go:114	Cordoning	{"node": "ip-10-40-5-12.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.135Z	INFO	kubernetes/eventhandler.go:123	Cordoned	{"node": "ip-10-40-5-12.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.135Z	INFO	kubernetes/eventhandler.go:132	Scheduled drain	{"node": "ip-10-40-5-12.us-west-2.compute.internal", "after": "2019-10-15T16:21:23.937Z"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.135Z	DEBUG	kubernetes/eventhandler.go:114	Cordoning	{"node": "ip-10-40-9-88.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.135Z	INFO	kubernetes/eventhandler.go:123	Cordoned	{"node": "ip-10-40-9-88.us-west-2.compute.internal"}
draino-558558fbb-d8mlr:draino 2019-10-15T14:51:24.135Z	INFO	kubernetes/eventhandler.go:132	Scheduled drain	{"node": "ip-10-40-9-88.us-west-2.compute.internal", "after": "2019-10-15T16:31:23.937Z"}
```

sstarcher avatar Oct 15 '19 14:10 sstarcher

The current version used by the helm chart considers all states to be bad states. Do you have a recommended set of states that we can put in the helm chart?

sstarcher avatar Oct 15 '19 15:10 sstarcher

Actually it is not, as seen here. I am using Draino with node-problem-detector, and the conditions below are working fine in my case:

  • KernelDeadlock
  • ReadonlyFilesystem
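
For reference, draino is configured by naming the node conditions it should react to; per its README, these are passed as positional command-line arguments. A sketch of limiting draino to only the two conditions above might look like the following (flag availability and helm chart value names should be checked against the version you deploy; `--dry-run` is shown as a safety measure while validating the config):

```shell
# Sketch: react only to the KernelDeadlock and ReadonlyFilesystem
# conditions reported by node-problem-detector, rather than treating
# every node condition as a bad state.
# --dry-run logs what would be cordoned/drained without doing it;
# remove it once the behavior looks correct.
draino --dry-run KernelDeadlock ReadonlyFilesystem
```

When installing via the helm chart, the equivalent is to set the chart's condition list to these two values instead of leaving the default, which in the version reported above treated all states as drain-worthy.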

bilalcaliskan avatar May 23 '21 10:05 bilalcaliskan