Failed schedule deletion seen in draino logs
Info
- Kops cluster with version 1.16+ running on AWS.
- planetlabs/draino:e0d5277 image is being used.
We see constant error messages like this in the draino logs:
2021-02-19T21:34:50.669Z ERROR kubernetes/drainSchedule.go:68 Failed schedule deletion {"key": "ip-10-53-32-128.us-west-2.compute.internal"}
github.com/planetlabs/draino/internal/kubernetes.(*DrainSchedules).DeleteSchedule
/go/src/github.com/planetlabs/draino/internal/kubernetes/drainSchedule.go:68
github.com/planetlabs/draino/internal/kubernetes.(*DrainingResourceEventHandler).OnDelete
/go/src/github.com/planetlabs/draino/internal/kubernetes/eventhandler.go:152
k8s.io/client-go/tools/cache.FilteringResourceEventHandler.OnDelete
/go/pkg/mod/k8s.io/[email protected]/tools/cache/controller.go:251
k8s.io/client-go/tools/cache.(*processorListener).run.func1.1
/go/pkg/mod/k8s.io/[email protected]/tools/cache/shared_informer.go:609
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:284
k8s.io/client-go/tools/cache.(*processorListener).run.func1
/go/pkg/mod/k8s.io/[email protected]/tools/cache/shared_informer.go:601
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88
k8s.io/client-go/tools/cache.(*processorListener).run
/go/pkg/mod/k8s.io/[email protected]/tools/cache/shared_informer.go:599
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:71
2021-02-23T11:14:33.146Z ERROR kubernetes/drainSchedule.go:68 Failed schedule deletion {"key": "ip-10-53-31-9.us-west-2.compute.internal"}
2021-02-23T11:15:47.174Z ERROR kubernetes/drainSchedule.go:68 Failed schedule deletion {"key": "ip-10-53-32-225.us-west-2.compute.internal"}
Not sure if I am using the correct image of draino.
Note: I have already deployed node-problem-detector and cluster-autoscaler alongside it.
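
For context, the stack trace shows DrainSchedules.DeleteSchedule being called from the drain event handler's OnDelete, i.e. when a node object is removed from the cluster. Below is a minimal sketch of that pattern; it is not draino's actual code, and the map/timer layout and the error text are assumptions, but it illustrates when a schedule deletion can fail: the node disappears (for example because cluster-autoscaler terminated the instance) and there is no pending drain left to cancel for that key.

// Sketch only: assumed structure, not draino's real implementation.
package main

import (
	"fmt"
	"sync"
	"time"
)

type DrainSchedules struct {
	sync.Mutex
	schedules map[string]*time.Timer // pending drains keyed by node name
}

func NewDrainSchedules() *DrainSchedules {
	return &DrainSchedules{schedules: map[string]*time.Timer{}}
}

// Schedule registers a drain for the node after the given delay.
func (d *DrainSchedules) Schedule(node string, delay time.Duration, drain func(string)) {
	d.Lock()
	defer d.Unlock()
	d.schedules[node] = time.AfterFunc(delay, func() { drain(node) })
}

// DeleteSchedule cancels a pending drain when the node object is deleted.
// Assumption: if the node was never scheduled (or the entry is already gone),
// there is nothing to cancel and an error is returned, which the caller
// would log as "Failed schedule deletion".
func (d *DrainSchedules) DeleteSchedule(node string) error {
	d.Lock()
	defer d.Unlock()
	t, ok := d.schedules[node]
	if !ok {
		return fmt.Errorf("no drain schedule found for node %s", node)
	}
	t.Stop()
	delete(d.schedules, node)
	return nil
}

func main() {
	s := NewDrainSchedules()
	s.Schedule("ip-10-53-32-128.us-west-2.compute.internal", time.Minute,
		func(n string) { fmt.Println("draining", n) })

	// First deletion cancels the pending drain; the second has nothing left
	// to cancel and returns the error that would end up in the log.
	fmt.Println(s.DeleteSchedule("ip-10-53-32-128.us-west-2.compute.internal")) // <nil>
	fmt.Println(s.DeleteSchedule("ip-10-53-32-128.us-west-2.compute.internal")) // error
}

Under that assumption the message is largely cosmetic: deleting a node that was never scheduled for a drain, or whose drain already ran, simply has nothing to cancel.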
I just had some luck by changing the RBAC, adding:

- apiGroups: ['']
  resources: [nodes/status]
  verbs: [update, patch]

The full ClusterRole:
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels: {component: draino}
  name: draino
rules:
- apiGroups: [apps]
  resources: [statefulsets]
  verbs: [create, update, get, watch, list]
- apiGroups: ['']
  resources: [endpoints]
  verbs: [create, update, get, watch, list]
- apiGroups: ['']
  resources: [events]
  verbs: [create, patch, update]
- apiGroups: ['']
  resources: [nodes]
  verbs: [get, watch, list, update]
- apiGroups: ['']
  resources: [nodes/status]
  verbs: [update, patch]
- apiGroups: ['']
  resources: [pods]
  verbs: [get, watch, list]
- apiGroups: ['']
  resources: [pods/eviction]
  verbs: [create]
- apiGroups: [extensions]
  resources: [daemonsets]
  verbs: [get, watch, list]
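
If a rule like the nodes/status one is added but the errors persist, it can help to confirm that the grant is actually effective for draino's ServiceAccount. A rough sketch with client-go is below; it issues a SelfSubjectAccessReview, so it checks the permissions of whatever ServiceAccount it runs under (for example a debug pod that sets the same serviceAccountName as draino). The verb and resource values simply mirror the rule above.

// Sketch: in-cluster permission check for update on nodes/status.
package main

import (
	"context"
	"fmt"

	authorizationv1 "k8s.io/api/authorization/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	review := &authorizationv1.SelfSubjectAccessReview{
		Spec: authorizationv1.SelfSubjectAccessReviewSpec{
			ResourceAttributes: &authorizationv1.ResourceAttributes{
				Verb:        "update",
				Resource:    "nodes",
				Subresource: "status",
			},
		},
	}
	res, err := client.AuthorizationV1().SelfSubjectAccessReviews().Create(
		context.TODO(), review, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("update nodes/status allowed: %v (reason: %q)\n",
		res.Status.Allowed, res.Status.Reason)
}

The same check can be done from the CLI with kubectl auth can-i update nodes/status --as=system:serviceaccount:<namespace>:draino, where <namespace> is wherever draino is deployed.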
I have the same errors; granting update to nodes/status did not help.
This happens for me when the cluster-autoscaler terminates an instance.
- kubernetes: v1.20.5
- draino: planetlabs/draino:e0d5277
- cluster-autoscaler: v1.20.0
Same error here, will try the suggestion with the RBAC permission!