hetzner-kube
Feature request: add support for floating IPs
Add support for binding "floating IPs" to specific nodes (worker/master), so a node has a "static IP" even if the instance reboots and gets assigned a new IP.
duplicate of https://github.com/xetys/hetzner-kube/issues/13
Not exactly. I think that comes from mixing up "failover IPs" and "floating IPs". The first issue was about failover IPs, but I think you really mean a floating IP, controlled by some script or component in k8s that assigns the floating IP to specific servers. The answer in https://github.com/hetznercloud/hcloud-cloud-controller-manager/issues/6 stated that you cannot assign Hetzner's existing failover IPs (which have existed for bare metal for a long time) to the new cloud. But that is not what we are looking for. We need a script handling floating IPs.
This could possibly be achieved using a full k8s setup, if assigning an IP to a node over the API is enough. But it will most likely be an on-host keepalived solution, which involves some firewall updates as well. However, this is a notable piece of work, and I'm still convinced it is actually a ticket for the hcloud-cloud-controller-manager, or a distinct project that gets deployed as a k8s deployment.
An alternative approach would be adding an add floating-ip -n <cluster-name> <ip> command that sets up a keepalived configuration on all worker nodes. In that case, you can simply point your domain at the floating IP, and the IP is assigned to the first working node in the priority list. That's better for use with ingress, as it doesn't bring a LoadBalancer service type into the cluster.
if assigning an IP over API to a node is enough
It's not; you also need to configure it on your host. It would be too easy otherwise :smile:
https://wiki.hetzner.de/index.php/CloudServer/en#What_are_floating_IPs_and_how_do_they_work.3F
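For reference, a quick sketch of the host side (eth0 and the address are placeholders): after assigning the floating IP to a server via the API or the console, it still has to be added to the interface, and persisted if it should survive reboots.
# add the floating IP on the node it is currently assigned to (eth0 assumed)
ip addr add 78.46.231.10/32 dev eth0
# this is not persistent; add it to /etc/network/interfaces or your netplan
# configuration if it should survive a reboot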
yeah, it was a poor guess. I would prefer the keepalived solution anyway.
I think this is quite an important feature. As it stands, the cluster isn't really HA, because you can't control it from the outside when the master node configured in your kubectl config dies. Or am I missing a part?
It would also solve the single point of failure with ingress for people who don't want to use DNS load balancing/failover.
That's a bit "fifty-fifty". If you install k8s with HA mode enabled, all the master nodes are load balanced on the client side. This means that if the main master fails, the cluster itself keeps running, and that matters more for the HA experience than whether you still have remote access with kubectl. I verified this feature before merging it.
On the other side, the kubeconfig is currently generated with the first master's IP address. So if that one fails, you won't get any response when using kubectl, and if you just change the IP to the second master in "~/.kube/config", it will fail, as the other masters are not part of the cert SAN. This is more of a UX issue than something that harms HA. If you log in to master 2 and point kubectl at "127.0.0.1:6443", you will get a valid response and can use your cluster.
If you add floating IPs, you have to distinguish between a master floating IP (and the reason you might need it) and a worker floating IP. You could assign a floating IP to the masters (and add it to the cert SAN before you create the cluster) to make kubectl work even if masters fail. The motivation for a worker floating IP is to have one IP that switches between edge nodes for ingress traffic.
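As a rough sketch of the master part (assuming a plain kubeadm setup; hetzner-kube does not do this automatically at the time of writing), the floating IP would have to be in the API server certificate SANs from the start:
# sketch: include the masters' floating IP in the API server cert SANs at init time
# (1.2.3.4 is a placeholder for the floating IP)
kubeadm init --apiserver-cert-extra-sans=1.2.3.4
# afterwards ~/.kube/config can point at 1.2.3.4, and kubectl keeps working
# as long as any master currently holds that IP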
Valid points. I think we should start with a concept on how we can integrate floating IPs in general. Then we can take a look at the different use cases, but ingress is very neat indeed.
@xetys I would like to use this feature when it's ready, but until then I'll buy a floating IP right now, update the configuration on each master, and point our DNS at this floating IP. Will your feature be able to use a floating IP that has already been created, please?
I did a few tests with keepalived, works really well 🚀
Here is my failover script:
#!/bin/bash
# quit on error
set -e
# Get all the information from the API
# Adjust the description filter in the FLOATING_IP_ID and FLOATING_IP lines below to match your floating IP's description
export SERVER_ID=$(curl -H "Authorization: Bearer YOURTOKEN" "https://api.hetzner.cloud/v1/servers?name=$HOSTNAME" | jq '.servers[0].id')
export FLOATING_IP_ID=$(curl -H "Authorization: Bearer YOURTOKEN" "https://api.hetzner.cloud/v1/floating_ips" | jq '.floating_ips[] | select(.description=="keepalived")' | jq '.id')
export FLOATING_IP=$(curl -H "Authorization: Bearer YOURTOKEN" "https://api.hetzner.cloud/v1/floating_ips" | jq '.floating_ips[] | select(.description=="keepalived")' | jq -r '.ip')
# Change floating ip in hetzner backend
curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer YOURTOKEN" -d "{\"server\":$SERVER_ID}" "https://api.hetzner.cloud/v1/floating_ips/$FLOATING_IP_ID/actions/assign"
# Add the IP address to the default network interface
ip addr add $FLOATING_IP/32 dev eth0 || echo "IP was already added to the interface"
The only new dependency is jq; you can get it with apt install jq. I will try to wrap this up for hetzner-kube. Obviously you don't need to export the variables, but I did it for debugging.
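For the on-host keepalived variant mentioned earlier, the wiring could look roughly like this (a sketch only: interface name, priority, VRRP id and the script path are assumptions, and on Hetzner Cloud you will most likely also need unicast_src_ip/unicast_peer entries like in the generated configs further down):
# sketch: install the script above as /usr/local/bin/hetzner-failover.sh on each
# worker and let keepalived call it whenever the node becomes MASTER
cat > /etc/keepalived/keepalived.conf <<'EOF'
vrrp_instance floating_ip {
    state BACKUP
    interface eth0                  # adjust to your interface
    virtual_router_id 51
    priority 100                    # use a different priority on each node
    advert_int 1
    notify_master /usr/local/bin/hetzner-failover.sh
    virtual_ipaddress {
        78.46.231.10/32             # your floating IP
    }
}
EOF
systemctl restart keepalived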
Hello,
just an idea for a workaround until Hetzner provides failover targets for floating IPs (requires Docker Swarm).
I am using Hetzner Cloud to build an X-manager Docker Swarm (currently 3 managers).
I created a service that is scaled to exactly one replica and updates the floating IP via the Hetzner API. A placement constraint binds this service to the node with node.ManagerStatus.Leader=true only. The script checks the floating IP's target every x seconds and only updates the target once it points to a different IP than its own.
Summarized: the Docker Swarm leader, and only the leader, automatically updates the floating IP target when needed.
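For context, the check-and-update loop such a service runs boils down to something like this (a sketch against the cloud API, similar to the script earlier in this thread; the token, the "keepalived" description and the 30s interval are placeholders):
#!/bin/bash
# re-assign the floating IP to this node whenever it points somewhere else
API="https://api.hetzner.cloud/v1"
AUTH="Authorization: Bearer YOURTOKEN"
while true; do
  SERVER_ID=$(curl -s -H "$AUTH" "$API/servers?name=$HOSTNAME" | jq '.servers[0].id')
  FIP=$(curl -s -H "$AUTH" "$API/floating_ips" | jq '.floating_ips[] | select(.description=="keepalived")')
  FIP_ID=$(echo "$FIP" | jq '.id')
  TARGET=$(echo "$FIP" | jq '.server')
  if [ "$TARGET" != "$SERVER_ID" ]; then
    curl -s -X POST -H "Content-Type: application/json" -H "$AUTH" \
      -d "{\"server\":$SERVER_ID}" "$API/floating_ips/$FIP_ID/actions/assign"
  fi
  sleep 30
done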
Hi all, I created a fork of https://github.com/kubernetes/contrib/tree/master/keepalived-vip and modified it so it can use notify scripts. I did this to solve exactly this problem. I already tested it in one of my development clusters at Hetzner and it seems to be working. I committed the example resources I used so you can try it. This is work in progress, and I would appreciate any feedback and testing. Resource files are here: https://github.com/cornelius-keller/contrib/tree/master/keepalived-vip/notify-example-hetzner. I also commented on https://github.com/hetznercloud/hcloud-cloud-controller-manager/issues/6, but I guess here is more appropriate.
Hmmm. I expected a little more excitement; maybe it is not clear what this thing does. It does keepalived the Kubernetes way, and you don't need anything but the Kubernetes API to implement it. So what you need to do is put your failover IP and credentials into a Kubernetes secret:
apiVersion: v1
data:
  failover-ip: # echo <your failover ip> | base64
  hetzner-pass: # echo < your hetzner api pass > | base64
  hetzner-user: # echo < your hetzner api user > | base64
kind: Secret
metadata:
  name: hetzner-secret-failoverip-1
type: Opaque
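If you prefer, kubectl can create the same secret and handle the base64 encoding for you:
kubectl create secret generic hetzner-secret-failoverip-1 \
  --from-literal=failover-ip=<your failover ip> \
  --from-literal=hetzner-user=<your hetzner api user> \
  --from-literal=hetzner-pass=<your hetzner api pass>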
You put your notification script in a config map (or use mine)
apiVersion: v1
kind: ConfigMap
metadata:
  name: vip-notify
data:
  notify.sh: |
    #!/bin/bash
    ENDSTATE=$3
    NAME=$2
    TYPE=$1
    if [ "$ENDSTATE" == "MASTER" ] ; then
      HOST_IP=$(ip route get 8.8.8.8 | awk '{print $7 }')
      echo "setting Failover IP: $FAILOVER_IP to Server IP: $HOST_IP"
      curl -k -u "$HETZNER_USER:$HETZNER_PASS" https://robot-ws.your-server.de/failover/$FAILOVER_IP -d active_server_ip=$HOST_IP
    fi
You also need to put your failover IP again in the original keepalived-vip config map. For now this is a bit of duplication, but it was the "minimal working" way to implement this. Of course, in the final scenario this should point to your nginx-ingress service, not to echoheaders as in the example.
apiVersion: v1
kind: ConfigMap
metadata:
  name: vip-configmap
data:
  138.201.14.20: default/echoheaders # add your config map here. must map the base64 encoded IP in secrets.yaml
Finally, you need to deploy the keepalived controller. In the example I used a ReplicationController, but you can use a Deployment or a DaemonSet to have it on all nodes as well:
apiVersion: v1
kind: ReplicationController
metadata:
  name: kube-keepalived-vip
  labels:
    k8s-app: kube-keepalived-vip
spec:
  replicas: 1
  selector:
    k8s-app: kube-keepalived-vip
  template:
    metadata:
      labels:
        k8s-app: kube-keepalived-vip
        name: kube-keepalived-vip
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - ingress-nginx
              topologyKey: kubernetes.io/hostname
      hostNetwork: true
      serviceAccount: kube-keepalived-vip
      containers:
      - image: quay.io/cornelius/keepalived-vip:0.11_notify
        name: kube-keepalived-vip
        imagePullPolicy: Always
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /lib/modules
          name: modules
          readOnly: true
        - mountPath: /opt/notify
          name: notify
        # use downward API
        env:
        - name: HETZNER_USER
          valueFrom:
            secretKeyRef:
              key: hetzner-user
              name: hetzner-secret-failoverip-1
        - name: HETZNER_PASS
          valueFrom:
            secretKeyRef:
              key: hetzner-pass
              name: hetzner-secret-failoverip-1
        - name: FAILOVER_IP
          valueFrom:
            secretKeyRef:
              key: failover-ip
              name: hetzner-secret-failoverip-1
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: KEEPALIVED_NOTIFY
          value: /opt/notify/notify.sh
        # to use unicast
        args:
        - --services-configmap=default/vip-configmap
        - --watch-all-namespaces=true
        - --use-unicast=true
        # unicast uses the ip of the nodes instead of multicast
        # this is useful if running in cloud providers (like AWS)
        #- --use-unicast=true
      volumes:
      - hostPath:
          path: /lib/modules
        name: modules
      - configMap:
          name: vip-notify
          defaultMode: 0744
        name: notify
The pod anti-affinity is needed in my case because I still have an nginx ingress running with a hostPort mapping. But this will be replaced by the keepalived version soon.
So what happens: I tested the degree of HA this provides by scaling the replication controller to 3 replicas. If I enter one of the pods and look at the keepalived config, I see something like this:
root@devcluster06:/# cat /etc/keepalived/keepalived.conf
global_defs {
vrrp_version 3
vrrp_iptables KUBE-KEEPALIVED-VIP
}
vrrp_instance vips {
state BACKUP
interface enp4s0
virtual_router_id 50
priority 106
nopreempt
advert_int 1
track_interface {
enp4s0
}
notify /opt/notify/notify.sh
unicast_src_ip 94.130.34.213
unicast_peer {
138.201.37.92
138.201.52.38
144.76.138.212
144.76.223.202
144.76.223.203
46.4.114.60
}
virtual_ipaddress {
138.201.14.20
}
}
# Service: default/echoheaders
virtual_server 138.201.14.20 80 {
delay_loop 5
lvs_sched wlc
lvs_method NAT
persistence_timeout 1800
protocol TCP
real_server 10.233.88.58 8080 {
weight 1
TCP_CHECK {
connect_port 8080
connect_timeout 3
}
}
}
Now I had a curl request to the IP running in an endless loop, printing the echoed headers to the console. When I kill one of the pods in the replica set, there is a pause of 15-20 seconds in the output; then the IP is switched to another node and it continues. The same should happen if the node currently holding the virtual IP dies or reboots. I think this is the fastest failover you can get with Hetzner and k8s; in particular it is much faster than what I used before. Previously I had a single-pod failover IP controller that was scheduled via affinity to one node where nginx was also running with a hostPort mapping. So when that node died, it took a while for k8s to reschedule the pod to another node, up to 5 minutes. I hope this more detailed explanation helps you understand what I wanted to achieve, and maybe you give it a try. Please note that I am neither a user of hetzner-kube nor an expert with keepalived. I have been running k8s clusters on Hetzner bare metal for more than two years, and the failover IP problem was one I had not solved to my satisfaction until now. HTH
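For anyone who wants to reproduce the test, a loop like this is enough (138.201.14.20 is the failover IP from the example config above):
# prints the first line of the echoed headers; during a failover the output
# pauses for ~15-20 seconds and then continues from another node
while true; do
  curl -s --max-time 2 http://138.201.14.20/ | head -n 1
  sleep 1
done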
@cornelius-keller so, I am very excited 🥇 I can't wait to see it in my production env wrapped in a Helm chart. :-) Thank you!
@cornelius-keller It's me again. I made a Helm chart from the information you provided, but I'm stuck on some (newbie) problems. Maybe you (or someone else) want to try to help get it running?
Simply fork it,
change the values.yaml to your needs, and run:
helm install --name hetzner-failover hetzner-failover
https://github.com/exocode/helm-charts/
This is where I struggle at the moment:
kubectl describe replicationcontroller/kube-keepalived-vip
Name: kube-keepalived-vip
Namespace: default
Selector: k8s-app=kube-keepalived-vip
Labels: k8s-app=kube-keepalived-vip
Annotations: <none>
Replicas: 0 current / 1 desired
Pods Status: 0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: k8s-app=kube-keepalived-vip
name=kube-keepalived-vip
Service Account: kube-keepalived-vip
Containers:
kube-keepalived-vip:
Image: quay.io/cornelius/keepalived-vip:0.11_notify
Port: <none>
Host Port: <none>
Args:
--services-configmap=default/vip-configmap
--watch-all-namespaces=true
--use-unicast=true
Environment:
HETZNER_USER: <set to the key 'hetzner-user' in secret 'hetzner-secret-failoverip-1'> Optional: false
HETZNER_PASS: <set to the key 'hetzner-pass' in secret 'hetzner-secret-failoverip-1'> Optional: false
FAILOVER_IP: <set to the key 'failover-ip' in secret 'hetzner-secret-failoverip-1'> Optional: false
POD_NAME: (v1:metadata.name)
POD_NAMESPACE: (v1:metadata.namespace)
KEEPALIVED_NOTIFY: /opt/notify/notify.sh
Mounts:
/lib/modules from modules (ro)
/opt/notify from notify (rw)
Volumes:
modules:
Type: HostPath (bare host directory volume)
Path: /lib/modules
HostPathType:
notify:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: vip-notify
Optional: false
Conditions:
Type Status Reason
---- ------ ------
ReplicaFailure True FailedCreate
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreate 5m (x18 over 15m) replication-controller Error creating: pods "kube-keepalived-vip-" is forbidden: error looking up service account default/kube-keepalived-vip: serviceaccount "kube-keepalived-vip" not found
Hi @exocode, the service account used for RBAC was missing from my quick how-to, apologies. I created a PR for you that includes a service account.
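If you want to fix it by hand instead of pulling the PR, something like this should do; note that cluster-admin is broader than strictly necessary, it just keeps the sketch short:
# the replication controller references this service account, so it has to exist
kubectl create serviceaccount kube-keepalived-vip
# give it the permissions the keepalived-vip controller needs
kubectl create clusterrolebinding kube-keepalived-vip \
  --clusterrole=cluster-admin \
  --serviceaccount=default:kube-keepalived-vip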
Hi @cornelius-keller
I am still stuck, but I made some slight progress... Maybe you could help me finish the chart?
CrashLoopBackOff: Back-off 1m20s restarting failed container=kube-keepalived-vip pod=kube-keepalived-vip-42rjk_default(275cade9-6192-11e8-9a63-9600000ae6f9)
The container crashes. When I act quickly after installing with helm, I can see the keepalived.conf, which does not look as complete as your example:
kubectl exec kube-keepalived-vip-42rjk -it -- cat /etc/keepalived/keepalived.conf
global_defs {
vrrp_version 3
vrrp_iptables KUBE-KEEPALIVED-VIP
}
vrrp_instance vips {
state BACKUP
interface eth0
virtual_router_id 50
priority 109
nopreempt
advert_int 1
track_interface {
eth0
}
notify /opt/notify/notify.sh
unicast_src_ip 88.99.15.132
unicast_peer {
138.201.152.58
138.201.155.184
138.201.188.206
138.201.188.50
78.46.152.230
78.47.135.218
78.47.197.112
88.198.148.214
88.198.150.193
}
virtual_ipaddress {
}
}
this is the pod log:
kubectl logs kube-keepalived-vip-42rjk kube-keepalived-vip
Sun May 27 09:45:22 2018: Starting Keepalived v1.4.2 (unknown)
Sun May 27 09:45:22 2018: WARNING - keepalived was build for newer Linux 4.4.117, running on Linux 4.4.0-127-generic #153-Ubuntu SMP Sat May 19 10:58:46 UTC 2018
Sun May 27 09:45:22 2018: Opening file '/etc/keepalived/keepalived.conf'.
Sun May 27 09:45:22 2018: Starting Healthcheck child process, pid=21
Sun May 27 09:45:22 2018: Starting VRRP child process, pid=22
Sun May 27 09:45:22 2018: Opening file '/etc/keepalived/keepalived.conf'.
Sun May 27 09:45:22 2018: Netlink: error: message truncated
Sun May 27 09:45:22 2018: Registering Kernel netlink reflector
Sun May 27 09:45:22 2018: Registering Kernel netlink command channel
Sun May 27 09:45:22 2018: Registering gratuitous ARP shared channel
Sun May 27 09:45:22 2018: Opening file '/etc/keepalived/keepalived.conf'.
Sun May 27 09:45:22 2018: Using LinkWatch kernel netlink reflector...
Sun May 27 09:45:23 2018: Opening file '/etc/keepalived/keepalived.conf'.
Sun May 27 09:45:23 2018: Got SIGHUP, reloading checker configuration
Sun May 27 09:45:23 2018: Opening file '/etc/keepalived/keepalived.conf'.
Sun May 27 09:45:23 2018: Netlink: error: message truncated
Sun May 27 09:45:23 2018: Registering Kernel netlink reflector
Sun May 27 09:45:23 2018: Registering Kernel netlink command channel
Sun May 27 09:45:23 2018: Registering gratuitous ARP shared channel
Sun May 27 09:45:23 2018: Opening file '/etc/keepalived/keepalived.conf'.
Sun May 27 09:45:23 2018: WARNING - default user 'keepalived_script' for script execution does not exist - please create.
Sun May 27 09:45:23 2018: (vips): No VIP specified; at least one is required
Sun May 27 09:45:24 2018: Stopped
Sun May 27 09:45:24 2018: Keepalived_vrrp exited with permanent error CONFIG. Terminating
Sun May 27 09:45:24 2018: Stopping
Sun May 27 09:45:24 2018: Stopped
Sun May 27 09:45:29 2018: Stopped Keepalived v1.4.2 (unknown)
kubectl describe pod kube-keepalived-vip-42rjk kube-keepalived-vip
Name: kube-keepalived-vip-42rjk
Namespace: default
Node: cluster-worker-06/88.99.15.132
Start Time: Sun, 27 May 2018 11:41:38 +0200
Labels: k8s-app=kube-keepalived-vip
name=kube-keepalived-vip
Annotations: <none>
Status: Running
IP: 88.99.15.132
Controlled By: ReplicationController/kube-keepalived-vip
Containers:
kube-keepalived-vip:
Container ID: docker://ba60194851d51c02f22fe311e284164af32135a67b71b00f5781d50b03f47616
Image: quay.io/cornelius/keepalived-vip:0.11_notify
Image ID: docker-pullable://quay.io/cornelius/keepalived-vip@sha256:3fea1c570775366dee56f0da6acdf412f257ee9c521069e7e0fc9a49256949e3
Port: <none>
Host Port: <none>
Args:
--services-configmap=default/vip-configmap
--watch-all-namespaces=true
--use-unicast=true
State: Terminated
Reason: Completed
Exit Code: 0
Started: Sun, 27 May 2018 11:48:16 +0200
Finished: Sun, 27 May 2018 11:48:23 +0200
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Sun, 27 May 2018 11:45:22 +0200
Finished: Sun, 27 May 2018 11:45:29 +0200
Ready: False
Restart Count: 6
Environment:
HETZNER_USER: <set to the key 'hetzner-user' in secret 'hetzner-secret-failoverip-1'> Optional: false
HETZNER_PASS: <set to the key 'hetzner-pass' in secret 'hetzner-secret-failoverip-1'> Optional: false
HETZNER_TOKEN: <set to the key 'hetzner-token' in secret 'hetzner-secret-failoverip-1'> Optional: false
FLOATING_IP: <set to the key 'floating-ip' in secret 'hetzner-secret-failoverip-1'> Optional: false
POD_NAME: kube-keepalived-vip-42rjk (v1:metadata.name)
POD_NAMESPACE: default (v1:metadata.namespace)
KEEPALIVED_NOTIFY: /opt/notify/notify.sh
Mounts:
/lib/modules from modules (ro)
/opt/notify from notify (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-keepalived-vip-token-hkl5m (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
modules:
Type: HostPath (bare host directory volume)
Path: /lib/modules
HostPathType:
notify:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: vip-notify
Optional: false
kube-keepalived-vip-token-hkl5m:
Type: Secret (a volume populated by a Secret)
SecretName: kube-keepalived-vip-token-hkl5m
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 6m default-scheduler Successfully assigned kube-keepalived-vip-42rjk to cluster-worker-06
Normal SuccessfulMountVolume 6m kubelet, cluster-worker-06 MountVolume.SetUp succeeded for volume "modules"
Normal SuccessfulMountVolume 6m kubelet, cluster-worker-06 MountVolume.SetUp succeeded for volume "notify"
Normal SuccessfulMountVolume 6m kubelet, cluster-worker-06 MountVolume.SetUp succeeded for volume "kube-keepalived-vip-token-hkl5m"
Normal Pulling 5m (x4 over 6m) kubelet, cluster-worker-06 pulling image "quay.io/cornelius/keepalived-vip:0.11_notify"
Normal Pulled 5m (x4 over 6m) kubelet, cluster-worker-06 Successfully pulled image "quay.io/cornelius/keepalived-vip:0.11_notify"
Normal Created 5m (x4 over 6m) kubelet, cluster-worker-06 Created container
Normal Started 5m (x4 over 6m) kubelet, cluster-worker-06 Started container
Warning BackOff 1m (x19 over 6m) kubelet, cluster-worker-06 Back-off restarting failed container
Error from server (NotFound): pods "kube-keepalived-vip" not found
And sorry, what do you mean by this?
# add your config map here. must map the base64 encoded IP in secrets.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vip-configmap
data:
  138.201.14.20: default/echoheaders # add your config map here. must map the base64 encoded IP in secrets.yaml
The most recent helm chart is here:
https://github.com/exocode/helm-charts/tree/master/hetzner-failover
Hi @exocode
138.201.14.20: default/echoheaders # add your config map here. must map the base64 encoded IP in secrets.yaml
This is a typo. It should mean "put your failover IP here":
Basically, you put the failover IP there as the key, and the value is the k8s service it should point to. In this case it is the echoheaders service in the default namespace.
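In other words, with a floating/failover IP of e.g. 78.46.231.10 and an nginx ingress service ingress/nginx-ingress-controller (both names just examples), the config map could be created like this:
# key = the failover/floating IP, value = <namespace>/<service> it should route to
kubectl create configmap vip-configmap \
  --from-literal=78.46.231.10=ingress/nginx-ingress-controller \
  --dry-run -o yaml | kubectl apply -f -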
I will set up my hetzner-kube cluster tonight so I can really test your chart and provide better feedback.
~~I'm interested in how to add the floating IP to the default network interface of the node. I think the best way would be at setup time, as I'm not sure there is any way to do it from inside a container.~~
Never mind, I found out about it in the Docker documentation.
I got it working, thank you for your patience! My fault: I did not point to the correct ingress. I had "nginx", but the real one was "nginx-demo". So I changed that and it worked, yay! A big step for me in my journey through the awesome world of k8s and Hetzner.
Everything looks nice. Maybe someone (@cornelius-keller :-D) can add the "filter option for the description" field, like in the example in https://github.com/xetys/hetzner-kube/issues/58#issuecomment-375089121?
In the meantime I found out that there are already charts out there. This one is interesting: https://github.com/munnerz/keepalived-cloud-provider, because it implements the Kubernetes cloud-controller-manager.
Hetzner also has an hcloud project, which may be another way to achieve this task (https://github.com/hetznercloud/hcloud-cloud-controller-manager). I hope I understood all of that correctly...
I tried it today. It seems to work, but somehow I can't route any traffic, as the requests time out.
The IP is assigned to the right node in the Hetzner backend. On the node I can also find the IP in the network configuration:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether xx:00:00:xx:xx:bb brd ff:ff:ff:ff:ff:ff
inet 88.99.xx.xx/32 brd 88.99.xx.xx scope global eth0
valid_lft forever preferred_lft forever
<!-- Floating IP -->
inet 78.46.231.xxx/32 scope global eth0
valid_lft forever preferred_lft forever
<!-- / Floating IP -->
Any hints? I'm using floating IPs.
I could narrow the problem down to the ingress-nginx helm chart.
I got it working with the ingress-nginx helm chart: instead of using the host network, one can pass an external IP:
helm install stable/nginx-ingress --name ingress --namespace ingress --set rbac.create=true,controller.kind=DaemonSet,controller.service.type=ClusterIP,controller.service.externalIPs.[0]=YOUR_FLOATING_IP,controller.stats.enabled=true,controller.metrics.enabled=true
Then ingress will be bound to this IP. Otherwise the firewall will not allow connections.
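A quick sanity check from outside, once the floating IP is attached to a node (the host header is a placeholder for one of your ingress rules):
# should be answered by the nginx ingress; without a matching rule you get the
# "default backend - 404" response
curl -v -H "Host: your-app.example.com" http://YOUR_FLOATING_IP/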
@JohnnyQQQQ I got ingress working via
helm install stable/nginx-ingress --name ingress --namespace ingress --set rbac.create=true,controller.kind=DaemonSet,controller.service.type=ClusterIP,controller.service.externalIPs='{1.2.3.4,5.6.7.8}',controller.stats.enabled=true,controller.metrics.enabled=true
i.e. not with a .[0] index.
@cornelius-keller what about using aledbf/kube-keepalived-vip as the base image? It looks like it supports multiple IPs per vip-configmap.
@exocode @cornelius-keller After deploying hetzner-keepalived-vip, the containers crash for some reason with the following error:
F0806 11:55:18.222653 1 controller.go:314] Error getting POD information: timed out waiting to observe own status as Running
goroutine 1 [running]:
k8s.io/contrib/keepalived-vip/vendor/github.com/golang/glog.stacks(0xc4202a0100, 0xc42012e1c0, 0x83, 0xd1)
/home/jck/go/src/k8s.io/contrib/keepalived-vip/vendor/github.com/golang/glog/glog.go:766 +0xcf
k8s.io/contrib/keepalived-vip/vendor/github.com/golang/glog.(*loggingT).output(0x1d5bc80, 0xc400000003, 0xc420422a50, 0x1cdb4fa, 0xd, 0x13a, 0x0)
/home/jck/go/src/k8s.io/contrib/keepalived-vip/vendor/github.com/golang/glog/glog.go:717 +0x30f
k8s.io/contrib/keepalived-vip/vendor/github.com/golang/glog.(*loggingT).printf(0x1d5bc80, 0x3, 0x1479b89, 0x21, 0xc4206bdcd8, 0x1, 0x1)
/home/jck/go/src/k8s.io/contrib/keepalived-vip/vendor/github.com/golang/glog/glog.go:655 +0x14b
k8s.io/contrib/keepalived-vip/vendor/github.com/golang/glog.Fatalf(0x1479b89, 0x21, 0xc4206bdcd8, 0x1, 0x1)
/home/jck/go/src/k8s.io/contrib/keepalived-vip/vendor/github.com/golang/glog/glog.go:1145 +0x67
main.newIPVSController(0xc4202024e0, 0x0, 0x0, 0x1, 0x7ffd7dbb4c5b, 0xd, 0x32, 0x3, 0x6)
/home/jck/go/src/k8s.io/contrib/keepalived-vip/controller.go:314 +0x229
main.main()
/home/jck/go/src/k8s.io/contrib/keepalived-vip/main.go:127 +0x468
Does somebody know how to solve this?
@JohnnyQQQQ referring to your comment https://github.com/xetys/hetzner-kube/issues/58#issuecomment-396896815, how did you do it, given that you're running the ingress as a DaemonSet? Did you assign the floating IP beforehand to all the workers where ingress runs (I'm not even sure the same floating IP can be assigned to different nodes at the same time)? Or how should this be done?
And when adding a new worker node to an existing cluster, would one need to manually assign the floating IP as well? Would you use your keepalived script as shown in https://github.com/xetys/hetzner-kube/issues/58#issuecomment-375089121 in case one can't assign the same floating IP address to multiple nodes at the same time?
@kaosmonk You can try the hetzner-failover-ip helm chart, for example with NodeSelector, multiple IPs and nginx-ingress.
@voron can you share some more insight into the replicaCount param? It depends on the number of edge routers, but it's not clear to me what these are. Are these all worker nodes? Since nginx-ingress is run as a DaemonSet, I'd suspect so. Or am I on the wrong track here?
In a case where I have 3 worker nodes in my cluster and ingress is running on each, I'd say I need to set replicaCount to 3?
@kaosmonk it's the number of keepalived pods to be spawned across k8s nodes. I don't see a reason to use any value other than 2. If you need a faster floating IP switch in case of simultaneous downtime of the 2 k8s nodes that carry a keepalived pod, there may be a reason to increase this number. But in this rare case I can just wait until k8s detects the node downtime and reschedules both keepalived pods to healthy nodes, and then one of these keepalived pods will move the floating IP to its node.
Gotcha, I appreciate the explanation!
@cornelius-keller Good job!
Did you try to open a PR to add the notify script to the original repo (kubernetes/contrib)?