kube-keepalived-vip
service unavailable after pod delete / persistence_timeout harmful?
I noticed that, after a kubectl delete pod of one pod backing a service configured for kube-keepalived-vip, client connections (HTTPS) would time out for well over 10 minutes, even after the replacement pod was up and working correctly (on a new endpoint).
The endpoint is removed correctly from keepalived.conf, yet ipvsadm shows that traffic is still forwarded to the removed endpoint:
# ipvsadm -L
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.249.159.12:80 wlc persistent 1800
-> 172.30.20.105:80 Masq 1 0 0
-> 172.30.215.180:80 Masq 1 0 0
TCP 10.249.159.12:443 wlc persistent 1800
-> 172.30.20.105:443 Masq 1 0 0
-> 172.30.215.81:443 Masq 1 1 0 <<== removed
-> 172.30.215.180:443 Masq 1 0 0
# ipvsadm -Lcn
IPVS connection entries
pro expire state source virtual destination
TCP 29:25 ASSURED 10.249.238.166:0 10.249.159.12:443 172.30.215.81:443
TCP 00:57 SYN_RECV 10.249.238.166:43913 10.249.159.12:443 172.30.215.81:443
TCP 00:26 ASSURED 10.249.159.41:0 10.249.159.12:443 172.30.215.81:443
The simplest way to make things work again is to restart the kube-keepalived-vip container, which gets rid of the persisted connection entries.
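For reference, the kernel exposes two IPVS sysctls (documented in the kernel's ipvs-sysctl docs) that look relevant to exactly this situation. I have not verified whether kube-keepalived-vip sets them, so treat this as an untested workaround sketch rather than a confirmed fix:

```shell
# Untested workaround sketch: tell IPVS to drop state that points at a
# real server which no longer exists, instead of forwarding into the void.

# Expire connections immediately when their destination real server is
# deleted, rather than waiting for the TCP / persistence timers.
sysctl -w net.ipv4.vs.expire_nodest_conn=1

# Also expire persistence templates that point at a quiescent (weight 0)
# or removed real server.
sysctl -w net.ipv4.vs.expire_quiescent_template=1
```

Both would need to be set in the network namespace where IPVS runs (i.e. inside or alongside the kube-keepalived-vip container), which may require a privileged container.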
Reading http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.persistent_connection.html, it appears to me that persistence simply does not work the way keepalived uses it here, i.e. when the config is reloaded and a real server is deleted:
With persistent connection, the connection table doesn't clear till the persistence timeout (set with ipvsadm) time after the last client disconnects. This time defaults to about 5mins but can be much longer. Thus you cannot bring down a realserver offering a persistent service, till the persistence timeout has expired - clients who have connected in recently can still reconnect.
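For what it's worth, the "persistent 1800" in the ipvsadm output above is the persistence timeout in seconds (30 minutes), which matches the outage window I'm seeing. Lowering it would shrink the window but not remove it; in keepalived's own config syntax that would look something like the following (illustrative values, and the config here is generated by the controller, so this is not directly settable):

```
virtual_server 10.249.159.12 443 {
    lb_algo wlc
    lb_kind NAT
    # persistence window in seconds; 1800 (30 min) in the output above
    persistence_timeout 600
    ...
}
```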
I have not (yet) found any viable approach to avoid this issue elegantly.
NOTE: This does not happen when the health check fails, e.g.:
Wed Jul 15 13:03:57 2020: TCP_CHECK on service [172.30.215.185]:tcp:443 failed after 1 retries.
Wed Jul 15 13:03:57 2020: Removing service [172.30.215.185]:tcp:443 to VS [10.249.159.12]:tcp:443
In that case, new connections are being NATed correctly to a surviving endpoint:
# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.249.159.12:80 wlc persistent 1800
-> 172.30.20.105:80 Masq 1 0 0
-> 172.30.215.185:80 Masq 1 0 0
TCP 10.249.159.12:443 wlc persistent 1800
-> 172.30.20.105:443 Masq 1 0 5
# ipvsadm -Lnc
IPVS connection entries
pro expire state source virtual destination
TCP 00:01 CLOSE 10.249.238.166:53628 10.249.159.12:443 172.30.20.105:443
TCP 00:00 CLOSE 10.249.238.166:53627 10.249.159.12:443 172.30.20.105:443
TCP 00:09 CLOSE 10.249.238.166:53632 10.249.159.12:443 172.30.20.105:443
TCP 00:03 CLOSE 10.249.238.166:53629 10.249.159.12:443 172.30.20.105:443
TCP 00:07 CLOSE 10.249.238.166:53631 10.249.159.12:443 172.30.20.105:443
TCP 00:05 CLOSE 10.249.238.166:53630 10.249.159.12:443 172.30.20.105:443
TCP 29:59 ASSURED 10.249.238.166:0 10.249.159.12:443 172.30.20.105:443
TCP 26:54 ASSURED 10.249.238.166:0 10.249.159.12:65535 172.30.215.185:65535
I am also aware that this could be classified as a keepalived issue, yet it is particularly relevant for this project, because other mitigations, such as quiescing a real server by setting its weight to zero instead of deleting it outright, are not directly available here.
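For completeness, the graceful-removal variant I mean is possible with plain ipvsadm, just not through this controller. A sketch, using the addresses from the output above (I have not tried wiring this into kube-keepalived-vip):

```shell
# Sketch of graceful real-server removal with plain ipvsadm
# (not something kube-keepalived-vip exposes today).
VIP=10.249.159.12
RIP=172.30.215.81

# 1. Quiesce: set the real server's weight to 0 (-m keeps NAT/masquerade
#    forwarding) so no new connections are assigned to it.
ipvsadm -e -t "$VIP:443" -r "$RIP:443" -w 0 -m

# 2. Wait for the persistence timeout (1800 s here) to expire, or for
#    ActiveConn/InActConn on that server to drain to zero.

# 3. Only then actually delete the real server.
ipvsadm -d -t "$VIP:443" -r "$RIP:443"
```

That ordering is what the LVS HOWTO quoted above implies: you cannot safely remove a real server offering a persistent service until the persistence timeout has expired.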