[RFE] Bootstrap mode
I am trying to join a node to a cluster that requires additional network configuration before the CNI (Cilium) can come online. However, nmstate-handler tries to reach the Kubernetes control plane via the internal service IP.
This causes the node bootstrap to deadlock: the CNI requires additional network setup before it can serve traffic (e.g. the service IP range), while NMState requires the Kube API server in order to configure the network.
Even without the additional setup, kubelet can still reach the Kube API via the cluster's external / load-balanced address.
I can think of two possible solutions:
- provide a config parameter to set the external Kube API server address.
- add a fallback config via volume mount (ConfigMap/Secret) or environment variable so that a default config is delivered via kubelet.
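To illustrate the first option, here is a rough sketch of how the handler could pick the API server address. The variable name `NMSTATE_KUBE_API_SERVER` is purely hypothetical (nothing in kubernetes-nmstate reads it today). The fallback uses `KUBERNETES_SERVICE_HOST` / `KUBERNETES_SERVICE_PORT`, the variables kubelet injects into pods and that in-cluster client config is built from.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// overrideEnv is a hypothetical variable name used only for this sketch;
// kubernetes-nmstate does not define it today.
const overrideEnv = "NMSTATE_KUBE_API_SERVER"

// resolveAPIServerURL returns the API server URL the handler should dial.
// An explicit override wins; otherwise it falls back to the in-cluster
// service address taken from the kubelet-injected environment.
func resolveAPIServerURL(getenv func(string) string) (string, error) {
	if override := getenv(overrideEnv); override != "" {
		return override, nil
	}
	host := getenv("KUBERNETES_SERVICE_HOST")
	port := getenv("KUBERNETES_SERVICE_PORT")
	if host == "" || port == "" {
		return "", fmt.Errorf("no API server address: set %s or run in-cluster", overrideEnv)
	}
	// JoinHostPort also brackets IPv6 literals correctly.
	return "https://" + net.JoinHostPort(host, port), nil
}

func main() {
	url, err := resolveAPIServerURL(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("API server:", url)
}
```

The resolved URL could presumably be assigned to the `Host` field of the `rest.Config` before the manager is created. I have not checked whether anything else in the handler depends on the service IP.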
Logs:
{"level":"info","ts":"2025-08-02T03:54:44.350Z","logger":"setup","msg":"Try to take exclusive lock on file: /var/k8s_nmstate/handler_lock"}
{"level":"info","ts":"2025-08-02T03:54:44.350Z","logger":"setup","msg":"Successfully took nmstate exclusive lock"}
{"level":"info","ts":"2025-08-02T03:54:44.350Z","logger":"setup","msg":"Creating manager"}
{"level":"error","ts":"2025-08-02T03:55:14.366Z","msg":"Failed to get API Group-Resources","error":"Get \"https://<api-svc-ip>:443/api?timeout=32s\": dial tcp <api-svc-ip>:443: i/o timeout","stacktrace":"sigs.k8s.io/controller-runtime/pkg/cluster.New\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/cluster/cluster.go:161\nsigs.k8s.io/controller-runtime/pkg/manager.New\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/manager/manager.go:351\nmain.mainHandler\n\t/go/src/github.com/openshift/kubernetes-nmstate/cmd/handler/main.go:139\nmain.main\n\t/go/src/github.com/openshift/kubernetes-nmstate/cmd/handler/main.go:89\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:271"}
{"level":"error","ts":"2025-08-02T03:55:14.366Z","logger":"setup","msg":"unable to start manager","error":"Get \"https://<api-svc-ip>:443/api?timeout=32s\": dial tcp <api-svc-ip>:443: i/o timeout","stacktrace":"main.mainHandler\n\t/go/src/github.com/openshift/kubernetes-nmstate/cmd/handler/main.go:141\nmain.main\n\t/go/src/github.com/openshift/kubernetes-nmstate/cmd/handler/main.go:89\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:271"}
Yeah, at the moment this is kind of expected behaviour given the design. I am just not sure that allowing the external API address to be set would, on its own, solve the whole problem of "bootstrap mode".
We had similar discussions in MetalLB and it turned out to be a bit more difficult than it sounds. In any case, I will mark this issue clearly as an RFE because it's a good idea, but potentially high dev effort.
Feel free to contribute any work if you have solved this issue.