hcloud-cloud-controller-manager
Robot server with cloud load balancer over vSwitch | hcloud-ccm CrashLoopBackOff
TL;DR
We are trying to set up a K3s cluster on dedicated Hetzner servers (AX42) with cloud load balancer support over the vSwitch.
Expected behavior
A running hcloud-cloud-controller-manager with load balancer support over the vSwitch.
Observed behavior
As soon as we add the following environment variables from your "How to attach load balancers to Robot private IPs" documentation, the CCM pod crashes with this message:
F0513 15:12:26.452494 1 main.go:62] Cloud provider could not be initialized: hcloud/newCloud: checking if server is in Network not possible: serverIsAttachedToNetwork: Get "http://169.254.169.254/hetzner/v1/metadata/private-networks": context deadline exceeded (Client.Timeout exceeded while awaiting h
helm-values.yaml
env:
  HCLOUD_NETWORK:
    valueFrom:
      secretKeyRef:
        name: hcloud
        key: network
  HCLOUD_NETWORK_ROUTES_ENABLED:
    value: "false"
Minimal working example
K3s server (version v1.32.4+k3s1) with the following config:
node-ip: "{{ ansible_facts['enp6s0.4000']['ipv4']['address'] }}"
flannel-backend: 'none'
protect-kernel-defaults: true
secrets-encryption: true
kube-apiserver-arg:
- 'audit-log-path=/var/lib/rancher/k3s/server/logs/audit.log'
- 'audit-policy-file=/var/lib/rancher/k3s/server/audit.yaml'
- 'audit-log-maxage=30'
- 'audit-log-maxbackup=10'
- 'audit-log-maxsize=100'
- 'enable-admission-plugins=NodeRestriction,EventRateLimit'
- 'admission-control-config-file=/var/lib/rancher/k3s/server/psa.yaml'
kubelet-arg:
- 'cloud-provider=external'
- 'streaming-connection-idle-timeout=5m'
- "tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,\
TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,\
TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,\
TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,\
TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,\
TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305"
kube-controller-manager-arg: 'terminated-pod-gc-threshold=10'
disable-network-policy: true
disable-kube-proxy: true
disable-cloud-controller: true
disable:
- 'servicelb'
- 'traefik'
- 'local-storage'
Cilium helm chart (version 1.17.3)
cilium-helm-values.yaml
operator:
  rollOutPods: true
  nodeSelector:
    'node-role.kubernetes.io/control-plane': "true"
hostFirewall:
  enabled: false
rollOutCiliumPods: true
encryption:
  enabled: true
  type: 'wireguard'
  nodeEncryption: true
k8sServiceHost: "127.0.0.1"
k8sServicePort: 6444
kubeProxyReplacement: true
kubeProxyReplacementHealthzBindAddr: "0.0.0.0:10256"
k8sClientRateLimit:
  qps: 50
  burst: 200
hcloud ccm (version 1.24.0)
hcloud-ccm-helm-values.yaml
robot:
  enabled: true
networking:
  enabled: false
env:
  HCLOUD_NETWORK:
    valueFrom:
      secretKeyRef:
        name: hcloud
        key: network
  HCLOUD_NETWORK_ROUTES_ENABLED:
    value: "false"
  HCLOUD_TOKEN:
    valueFrom:
      secretKeyRef:
        name: hcloud
        key: token
  ROBOT_USER:
    valueFrom:
      secretKeyRef:
        name: hcloud
        key: robot-user
        optional: true
  ROBOT_PASSWORD:
    valueFrom:
      secretKeyRef:
        name: hcloud
        key: robot-password
        optional: true
Log output
Flag --allow-untagged-cloud has been deprecated, This flag is deprecated and will be removed in a future release. A cluster-id will be required on cloud instances.
I0513 15:26:23.546492 1 serving.go:386] Generated self-signed cert in-memory
W0513 15:26:23.546549 1 client_config.go:667] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0513 15:26:23.738562 1 metrics.go:67] Starting metrics server at :8233
F0513 15:26:28.871713 1 main.go:62] Cloud provider could not be initialized: hcloud/newCloud: checking if server is in Network not possible: serverIsAttachedToNetwork: Get "http://169.254.169.254/hetzner/v1/metadata/private-networks": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
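To check whether the timeout comes from the node itself rather than from the CCM, the metadata request in the log can be reproduced by hand. The following is a minimal Python sketch, assuming it is run directly on the Robot server (or from a host-network pod). Only the URL is taken from the log above; `probe_metadata` and its return format are names we made up for illustration. We would expect the endpoint to be unreachable on a dedicated server, which would match the `context deadline exceeded` error, but we have not confirmed that this is the intended behavior.

```python
import urllib.error
import urllib.request
from typing import Tuple

# URL copied from the CCM log output; everything else here is illustrative.
METADATA_URL = "http://169.254.169.254/hetzner/v1/metadata/private-networks"


def probe_metadata(url: str = METADATA_URL, timeout: float = 5.0) -> Tuple[bool, str]:
    """Send one GET request and report whether anything answered.

    Returns (reachable, detail). "reachable" is True as soon as an HTTP
    response arrives, even if it carries an error status.
    """
    # The metadata address is link-local, so bypass any proxy from the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return True, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The service answered, but with an error status code.
        return True, f"HTTP {exc.code}"
    except OSError as exc:
        # Covers URLError, connection refused, and socket timeouts.
        return False, f"unreachable: {exc}"
```

Calling `probe_metadata()` on the affected node and getting `(False, ...)` after the timeout would suggest that the attachment check cannot succeed there, independent of the Helm values.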
Additional information
If we set the following environment variable, which we found in the Go source code, hcloud-ccm starts.
env:
  HCLOUD_NETWORK_DISABLE_ATTACHED_CHECK:
    value: "true"
If this is expected behavior, it would be nice if it were documented.
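Until this is documented, a small pre-deploy check can catch the combination that crashed for us. The sketch below is a heuristic based only on what we observed in this report, not on the actual hcloud-ccm logic: it flags Helm values where `robot.enabled` is true and `HCLOUD_NETWORK` is set, but `HCLOUD_NETWORK_DISABLE_ATTACHED_CHECK` is not `"true"`. The function name and the assumption that the values are already parsed into a plain dict are ours.

```python
def missing_attached_check_override(values: dict) -> bool:
    """Return True if the values match the combination that crashed in this report.

    Heuristic only: Robot support enabled, HCLOUD_NETWORK set, and
    HCLOUD_NETWORK_DISABLE_ATTACHED_CHECK not set to "true".
    """
    env = values.get("env") or {}
    robot_enabled = bool((values.get("robot") or {}).get("enabled", False))
    network_set = "HCLOUD_NETWORK" in env
    flag = (env.get("HCLOUD_NETWORK_DISABLE_ATTACHED_CHECK") or {}).get("value", "")
    override_set = str(flag).strip().lower() == "true"
    return robot_enabled and network_set and not override_set


# Values equivalent to hcloud-ccm-helm-values.yaml from this report (secrets omitted).
ISSUE_VALUES = {
    "robot": {"enabled": True},
    "networking": {"enabled": False},
    "env": {
        "HCLOUD_NETWORK": {"valueFrom": {"secretKeyRef": {"name": "hcloud", "key": "network"}}},
        "HCLOUD_NETWORK_ROUTES_ENABLED": {"value": "false"},
    },
}
```

With the values from this report, `missing_attached_check_override(ISSUE_VALUES)` returns `True`; after adding the override shown above it returns `False`.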