kubespray
Changing loadbalancer FQDN for apiserver
Hi all,
I've deployed a k8s cluster with kubespray v2.18.1, with 3 masters and 4 workers. Everything is working well BUT the variable apiserver_loadbalancer_domain_name was not set on the first deployment, so the loadbalancer has the default FQDN, which is lb-apiserver.kubernetes.local.
I want to change this FQDN but I cannot find the correct method for doing so.
Any help would be appreciated.
@ccaillet1974 the supported method to change the certificate domain name for the kubernetes apiserver is indeed to set apiserver_loadbalancer_domain_name,
so please share more details about your setup and how you have set your ansible inventory variables, which may explain why it did not propagate correctly.
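A quick way to share that (a hedged sketch; the inventory path below is a placeholder) is to search the inventory tree for the variable and paste the matching lines:

    # List every place the variable is defined in the inventory (placeholder path).
    grep -rn apiserver_loadbalancer_domain_name inventory/mycluster/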
@cristicalin I'll give you more details :)
I've tried many things to change this FQDN and so far none has worked.
My inventory file is the following (I don't hide the IPs because it's a test environment and all IPs are private, on a non-routed subnet without a gateway, so nobody can reach them from the internet :) ):
all:
  hosts:
    lyo0-k8s-testm00:
      ansible_host: 10.128.10.64
      ip: 10.141.10.64
      access_ip: 10.144.10.64
    lyo0-k8s-testm01:
      ansible_host: 10.128.10.65
      ip: 10.141.10.65
      access_ip: 10.144.10.65
    lyo0-k8s-testm02:
      ansible_host: 10.128.10.66
      ip: 10.141.10.66
      access_ip: 10.144.10.66
    lyo0-k8s-testw00:
      ansible_host: 10.128.10.70
      ip: 10.141.10.70
      access_ip: 10.144.10.70
    lyo0-k8s-testw01:
      ansible_host: 10.128.10.71
      ip: 10.141.10.71
      access_ip: 10.144.10.71
    lyo0-k8s-testw02:
      ansible_host: 10.128.10.72
      ip: 10.141.10.72
      access_ip: 10.144.10.72
    lyo0-k8s-testw03:
      ansible_host: 10.128.10.73
      ip: 10.141.10.73
      access_ip: 10.144.10.73
  children:
    kube_control_plane:
      hosts:
        lyo0-k8s-testm00:
        lyo0-k8s-testm01:
        lyo0-k8s-testm02:
    kube_node:
      hosts:
        lyo0-k8s-testw00:
        lyo0-k8s-testw01:
        lyo0-k8s-testw02:
        lyo0-k8s-testw03:
    etcd:
      hosts:
        lyo0-k8s-testm00:
          etcd_address: 10.141.10.64
          etcd_access_address: 10.141.10.64
          etcd_metrics_port: 2381
        lyo0-k8s-testm01:
          etcd_address: 10.141.10.65
          etcd_access_address: 10.141.10.65
          etcd_metrics_port: 2381
        lyo0-k8s-testm02:
          etcd_address: 10.141.10.66
          etcd_access_address: 10.141.10.66
          etcd_metrics_port: 2381
    k8s_cluster:
      children:
        kube_control_plane:
        kube_node:
    calico-rr:
      hosts: {}
I've changed only the specified var, aka apiserver_loadbalancer_domain_name. The cluster has been deployed with kubernetes v1.23.5 and cilium v1.10.7. I used containerd as the runtime. All hosts are behind proxies for accessing external resources.
The errors appear at different steps but are always x509 authentication failures, because the certs are issued for all nodes (masters, workers) and the default FQDN lb-apiserver.kubernetes.local, but not for the new FQDN. If you want, I can reproduce the different tests and paste here the errors and the step at which they occur.
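As a side note, one way to confirm which names actually ended up in the apiserver certificate is to inspect its SANs on a control-plane node (a hedged sketch; the certificate path is an assumption and may instead live under /etc/kubernetes/pki):

    # Print the Subject Alternative Names of the apiserver certificate.
    openssl x509 -in /etc/kubernetes/ssl/apiserver.crt -noout -text \
      | grep -A1 'Subject Alternative Name'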
My tests for changing the FQDN have been (a sketch of the kind of command used follows the list):
- Using the upgrade-cluster.yml playbook with the changed var, with v2.18.1 and with the release-2.19 branch
- Using the cluster.yml playbook with the changed var, with v2.18.1 and with the release-2.19 branch
- Using the method described in https://github.com/kubernetes-sigs/kubespray/issues/5464 for renewing certs, with v2.18.1
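For reference, a hedged sketch of the kind of invocation used for the first two attempts (the inventory path and FQDN are placeholders; the variable is additionally forced as an extra-var here just to rule out group_vars precedence issues):

    # Re-run the playbook with the new FQDN forced on the command line (placeholders).
    ansible-playbook -i inventory/mycluster/hosts.yaml upgrade-cluster.yml -b \
      -e apiserver_loadbalancer_domain_name=apiserver.test.example.org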
In the meantime I've seen a strange behaviour with crictl and nerdctl in the versions installed by v2.18.1: on a master, for example, my containers appear with crictl ps but not with nerdctl ps.
I have other clusters which were deployed with the right FQDN and upgraded with release-2.19 to kubernetes 1.23.7 and cilium 1.11.3 without any problem, and this behaviour does not occur there.
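A likely explanation for the crictl/nerdctl difference (an assumption, not something confirmed here): crictl talks to the CRI socket directly, while nerdctl lists only the default containerd namespace unless told otherwise, and kubelet-managed containers live in the k8s.io namespace:

    # Containers created by kubelet show up here...
    crictl ps
    # ...but nerdctl only sees them when the k8s.io containerd namespace is selected.
    nerdctl --namespace k8s.io ps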
My target is to upgrade from 1.23.5 to 1.24.x with the release-2.19 branch and, of course, to change the apiserver FQDN :)
PS : sorry for my english ... or my french style :)
Christophe
@ccaillet1974 I don't see where in your inventory vars or group vars you have applied apiserver_loadbalancer_domain_name.
Note that it should be applied either on the all group, the k8s_cluster group or the kube_control_plane group.
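To verify on which hosts the variable actually resolves, an ad-hoc debug call can help (a hedged sketch; the inventory path is a placeholder, and an undefined result would mean the group_vars file is not being picked up):

    # Show the resolved value of the variable on every control-plane host.
    ansible -i inventory/mycluster/hosts.yaml kube_control_plane \
      -m debug -a 'var=apiserver_loadbalancer_domain_name'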
This var is defined in inventory/<inventory_name>/group_vars/all/all.yml.
Do you need an extract of this file?
Hi all,
@cristicalin: I don't understand where I need to declare the apiserver_loadbalancer_domain_name var in the inventory file. Currently this var is only defined in inventory/<inventory_name>/group_vars/all/all.yml and nowhere else, or am I missing something?
Could you please help me with this?
What kind of information do you need to help me fix this behaviour?
Thanks in advance for your reply
Christophe
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
I had the same issue and had to change the host name in the kubeconfig file on the master node to the new value of apiserver_loadbalancer_domain_name to get it working. The old host name is no longer valid because it is replaced in the kube apiserver certificate.
File: /etc/kubernetes/admin.conf
Field: clusters.cluster.server
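A minimal sketch of that change (assuming the default kubespray cluster name cluster.local plus a placeholder FQDN and port; verify both against your own admin.conf first):

    # Point the admin kubeconfig at the new apiserver FQDN (placeholders).
    kubectl config --kubeconfig=/etc/kubernetes/admin.conf set-cluster cluster.local \
      --server=https://apiserver.test.example.org:6443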