k8s-bigip-ctlr

Controller should support Node Taints for k8s > 1.6

Open Efoi opened this issue 2 years ago • 2 comments

Node taints

The controller should remove a node from appmgr when the node is tainted unschedulable.

Description

Nodes are restarted when they are patched. To make sure that no traffic is sent to a node that is being restarted or deleted, looking at the unschedulable taint would give the bigip-controller a head start while the node is being drained. It also prevents traffic from being sent to a drained node, where it is guaranteed to take a second hop.

Actual Problem

Traffic issues while nodes are being removed are not acceptable in any production environment. It is frustrating that the BIG-IP is updated up to 30 seconds after a node has been deleted from k8s. Important traffic may be lost.

Solution Proposed

Update the controller to look for and prefer the Node.Spec.Taints setting, and then defer to the Node.Spec.Unschedulable setting. If a node selector label has been configured, appManager will ignore the taint. Add a flag to enable or disable this check.
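
A minimal sketch of the proposed check, to make the ordering concrete. It is not the controller's actual code: `Node` and `Taint` are simplified local stand-ins for the Kubernetes core/v1 types, and `excludeNode`, `checkTaints`, and `nodeSelectorSet` are hypothetical names for the check, the proposed flag, and the "node selector label configured" condition. The taint key is the one Kubernetes places on cordoned nodes.

```go
package main

import "fmt"

// Taint and Node are simplified stand-ins for the Kubernetes core/v1
// types. Only the fields relevant to this check are modeled.
type Taint struct {
	Key    string
	Effect string
}

type Node struct {
	Name          string
	Unschedulable bool    // mirrors Node.Spec.Unschedulable
	Taints        []Taint // mirrors Node.Spec.Taints
}

// Taint key that Kubernetes sets on a cordoned node.
const unschedulableTaintKey = "node.kubernetes.io/unschedulable"

// excludeNode reports whether a node should be left out of the pool.
// checkTaints stands for the proposed enable/disable flag and
// nodeSelectorSet for "a node selector label has been configured";
// both names are assumptions made for this sketch.
func excludeNode(n Node, checkTaints, nodeSelectorSet bool) bool {
	// Prefer the taint, unless the check is disabled or a node
	// selector label is configured (in which case taints are ignored).
	if checkTaints && !nodeSelectorSet {
		for _, t := range n.Taints {
			if t.Key == unschedulableTaintKey {
				return true
			}
		}
	}
	// Otherwise defer to the Unschedulable setting.
	return n.Unschedulable
}

func main() {
	nodes := []Node{
		{Name: "worker-1"},
		{Name: "worker-2", Taints: []Taint{{Key: unschedulableTaintKey, Effect: "NoSchedule"}}},
		{Name: "worker-3", Unschedulable: true},
	}
	for _, n := range nodes {
		fmt.Printf("%s excluded=%v\n", n.Name, excludeNode(n, true, false))
	}
}
```

With the flag disabled, or with a node selector label configured, only the Unschedulable setting decides the result.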

Alternatives

Reducing node-poll-interval and verify-interval to a minimum will not prevent traffic loss.

Additional context

This feature request was previously implemented in #321, but is no longer a part of appmgr.

Efoi avatar Jun 03 '22 07:06 Efoi

@Efoi can you please reach out to me at [email protected]

I also received an SR, C3841160, and I want to understand and prioritize the requirements.

mdditt2000 avatar Jun 28 '22 17:06 mdditt2000

@Efoi thank you for the email update. Jira updated and prioritized!

mdditt2000 avatar Jun 29 '22 23:06 mdditt2000

What about when a pod needs to be drained but not the node? Currently, if the pod goes unready, CIS will just remove that member from the pool. How can we have the option of setting the pool member to "disable" or "force offline" instead?

walkingtub avatar Oct 19 '22 17:10 walkingtub

Hi, have there been any updates regarding this issue?

Efoi avatar Jun 02 '23 07:06 Efoi

Consider this issue with https://github.com/F5Networks/k8s-bigip-ctlr/issues/2965

trinaths avatar Jul 22 '23 19:07 trinaths