Failed to join cluster in Pull-Mode
What happened:
- karmada is installed in the host cluster, and its api-server is publicly exposed under <domain>; it is reachable and functional (kubectl apply etc. works)
- in an on-premise cluster (private networking, but with access to the internet) I try to join the host cluster:
karmadactl register <domain> --token bsbmb3.<secret> --discovery-token-ca-cert-hash sha256:bb5593ab60547232a812109d87d48d3f16834bb68fa99035e64c8beeea442dd3
[preflight] Running pre-flight checks
[prefligt] All pre-flight checks were passed
[karmada-agent-start] Waiting to perform the TLS Bootstrap
[karmada-agent-start] Waiting to construct karmada-agent kubeconfig
Unable to connect to the server: dial tcp: lookup karmada-apiserver.karmada-system.svc.cluster.local on 127.0.0.53:53: server misbehaving
The join fails.
What you expected to happen:
I expect the member cluster to be added successfully.
Unable to connect to the server: dial tcp: lookup karmada-apiserver.karmada-system.svc.cluster.local on 127.0.0.53:53: server misbehaving
Can you access karmada-apiserver from the machine where you are running karmadactl register?
Yes, I can access it (via kubectl) from the machine where I'm running karmadactl register.
I'm not sure why karmadactl even tries to resolve karmada-apiserver.karmada-system.svc.cluster.local via my local DNS, since I specified that the karmada-apiserver is to be found at <domain>.
Maybe this helps:
karmadactl register karmada-apiserver.example.com --token qffp4d.<secret> --discovery-token-ca-cert-hash sha256:bb5593ab60547232a812109d87d48d3f16834bb68fa99035e64c8beeea442dd3 --v 3
I1120 09:48:30.999314 16579 register.go:299] Registering cluster. cluster name: default
I1120 09:48:30.999360 16579 register.go:300] Registering cluster. cluster namespace: karmada-cluster
[preflight] Running pre-flight checks
I1120 09:48:30.999377 16579 register.go:422] Validating the existence of file /etc/karmada/bootstrap-karmada-agent.conf
I1120 09:48:30.999391 16579 register.go:422] Validating the existence of file /etc/karmada/karmada-agent.conf
I1120 09:48:30.999402 16579 register.go:422] Validating the existence of file /etc/karmada/pki/ca.crt
[prefligt] All pre-flight checks were passed
[karmada-agent-start] Waiting to perform the TLS Bootstrap
I1120 09:48:31.007058 16579 register.go:765] [discovery] Created cluster-info discovery client, requesting info from "karmada-apiserver.example.com"
I1120 09:48:31.063367 16579 register.go:803] [discovery] Requesting info from "karmada-apiserver.example.com" again to validate TLS against the pinned public key
I1120 09:48:31.111340 16579 register.go:820] [discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "karmada-apiserver.example.com"
I1120 09:48:31.111377 16579 register.go:437] [discovery] Using provided TLSBootstrapToken as authentication credentials for the join process
I1120 09:48:31.111399 16579 register.go:448] [discovery] writing bootstrap karmada-agent config file at /etc/karmada/bootstrap-karmada-agent.conf
I1120 09:48:31.112644 16579 register.go:457] [discovery] writing CA certificate at /etc/karmada/pki/ca.crt
[karmada-agent-start] Waiting to construct karmada-agent kubeconfig
Unable to connect to the server: dial tcp: lookup karmada-apiserver.karmada-system.svc.cluster.local on 127.0.0.53:53: server misbehaving
I did some further debugging:
karmadaClusterInfo in pkg/karmadactl/register/register.go:340 contains this:
{
"LocationOfOrigin": "",
"server": "https://karmada-apiserver.karmada-system.svc.cluster.local:5443",
"certificate-authority-data": "LS0tLS1C..."
}
Shouldn't there be karmada-apiserver.example.com instead?
Okay, I think I found the issue:
- karmada retrieves the control plane info using the bootstrap token
- this info contains karmada-apiserver.karmada-system.svc.cluster.local and not karmada-apiserver.example.com
- the agent kubeconfig is built with that info
- the agent tries to connect with that config and fails.
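A quick way to confirm this diagnosis is to check whether the server URL handed back during bootstrap is an in-cluster Service name. The following is only a minimal Go sketch: `isClusterInternal` is a hypothetical helper, not part of karmadactl, and the suffix check assumes the default `cluster.local` cluster domain.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isClusterInternal reports whether the host of a server URL looks like an
// in-cluster Service DNS name, which normally resolves only inside that cluster.
// Hypothetical helper; assumes the default "cluster.local" cluster domain.
func isClusterInternal(server string) (bool, error) {
	u, err := url.Parse(server)
	if err != nil {
		return false, err
	}
	host := u.Hostname()
	if host == "" {
		return false, fmt.Errorf("no host found in %q", server)
	}
	internal := strings.HasSuffix(host, ".svc.cluster.local") ||
		strings.HasSuffix(host, ".svc")
	return internal, nil
}

func main() {
	servers := []string{
		"https://karmada-apiserver.karmada-system.svc.cluster.local:5443",
		"https://karmada-apiserver.example.com",
	}
	for _, s := range servers {
		internal, err := isClusterInternal(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s cluster-internal=%v\n", s, internal)
	}
}
```

A server URL flagged as cluster-internal cannot be resolved by a machine outside the host cluster, which matches the lookup error in the log.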
Can I configure the karmada apiserver so that it tells bootstrapping clients to use karmada-apiserver.example.com?
cc @lonelyCZ @chaosi-zju for help
@maaft
This server info comes from karmada-apiserver.config. Are you using karmada-apiserver.example.com in karmada-apiserver.config?
You can try the steps below:
- change the server address in karmada-apiserver.config to karmada-apiserver.example.com
- create a new token with karmadactl using karmada-apiserver.config
- run karmadactl register again
karmada-apiserver.karmada-system.svc.cluster.local comes from the default/cluster-info configmap in the karmada control plane, which is read while bootstrapping the TLS certificate (the CA certificate of the karmada control plane apiserver).
I think karmadactl register should not use the endpoint from cluster-info, but the endpoint (the bootstrap endpoint) that the user passed to the command.
This is because the endpoint in cluster-info is sometimes unreachable from the member cluster, as this issue reports. The bootstrap endpoint, however, is guaranteed to be reachable, because the kubeconfig is generated only after the TLS (CA) certificate has been bootstrapped through it.
I think #4562 can be a fix.
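The idea proposed in this comment can be sketched in Go as follows. This is not the code from #4562: the `clusterInfo` type and the `withBootstrapEndpoint` function are hypothetical names used only for illustration, and defaulting to `https://` when no scheme is given is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// clusterInfo mirrors the two fields seen in the debug output of the issue.
// Hypothetical type, for illustration only.
type clusterInfo struct {
	Server string // "server" from cluster-info
	CAData string // "certificate-authority-data" from cluster-info
}

// withBootstrapEndpoint returns a copy of info whose Server is replaced by the
// endpoint the user passed on the command line. The CA data is kept, since it
// was already validated against that endpoint during discovery.
func withBootstrapEndpoint(info clusterInfo, endpoint string) (clusterInfo, error) {
	// Assumption: a bare host or host:port means HTTPS.
	if !strings.Contains(endpoint, "://") {
		endpoint = "https://" + endpoint
	}
	u, err := url.Parse(endpoint)
	if err != nil {
		return info, err
	}
	if u.Host == "" {
		return info, fmt.Errorf("bootstrap endpoint %q has no host", endpoint)
	}
	info.Server = u.String()
	return info, nil
}

func main() {
	fromClusterInfo := clusterInfo{
		Server: "https://karmada-apiserver.karmada-system.svc.cluster.local:5443",
		CAData: "LS0tLS1C...",
	}
	fixed, err := withBootstrapEndpoint(fromClusterInfo, "karmada-apiserver.example.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("agent kubeconfig server:", fixed.Server)
}
```

With this approach the agent kubeconfig points at the same address that already worked for discovery, so no change to cluster-info on the control plane is needed.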