IDP certificate issue
This might be basically the same as #2160, or maybe it's just related. I'm not sure, but since this issue is closed I thought I should open a new one.
We're using TKGI to create and manage Kubernetes clusters. TKGI is also the IDP (OIDC) for the clusters. When I run tkgi get-credentials <cluster-name>, a ~/.kube/config is created which contains the necessary information.
When the access token expires, kubectl is perfectly able to get a new one with the info from the config. But this Python lib isn't. I think the problem is that TKGI only writes the certificat and not the complete chain into idp-certificate-authority-data.
Please let me stress again that kubectl does not have a problem with that!
What happened (please include outputs or screenshots):
$ python
Python 3.12.8 (main, Dec 9 2024, 15:25:01) [GCC 8.5.0 20210514 (Red Hat 8.5.0-22)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from kubernetes import client, config
>>> config.load_config()
Traceback (most recent call last):
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 464, in _make_request
self._validate_conn(conn)
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 1093, in _validate_conn
conn.connect()
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connection.py", line 741, in connect
sock_and_verified = _ssl_wrap_socket_and_match_hostname(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connection.py", line 920, in _ssl_wrap_socket_and_match_hostname
ssl_sock = ssl_wrap_socket(
^^^^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/util/ssl_.py", line 460, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/util/ssl_.py", line 504, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.12/ssl.py", line 455, in wrap_socket
return self.sslsocket_class._create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.12/ssl.py", line 1041, in _create
self.do_handshake()
File "/usr/lib64/python3.12/ssl.py", line 1319, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1000)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 787, in urlopen
response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 488, in _make_request
raise new_e
urllib3.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1000)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/config/__init__.py", line 42, in load_config
load_kube_config(**kwargs)
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/config/kube_config.py", line 826, in load_kube_config
loader.load_and_set(config)
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/config/kube_config.py", line 589, in load_and_set
self._load_authentication()
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/config/kube_config.py", line 288, in _load_authentication
if self._load_auth_provider_token():
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/config/kube_config.py", line 307, in _load_auth_provider_token
return self._load_oid_token(provider)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/config/kube_config.py", line 413, in _load_oid_token
self._refresh_oidc(provider)
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/config/kube_config.py", line 450, in _refresh_oidc
response = client.request(
^^^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/client/api_client.py", line 373, in request
return self.rest_client.GET(url,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/client/rest.py", line 244, in GET
return self.request("GET", url,
^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/client/rest.py", line 217, in request
r = self.pool_manager.request(method, url,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/_request_methods.py", line 135, in request
return self.request_encode_url(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/_request_methods.py", line 182, in request_encode_url
return self.urlopen(method, url, **extra_kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/poolmanager.py", line 443, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 871, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 871, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 871, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 841, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/util/retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='tkgi-playground.it.nrw.de', port=8443): Max retries exceeded with url: /oauth/token/.well-known/openid-configuration (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1000)')))
>>>
What you expected to happen:
$ python
Python 3.12.8 (main, Dec 9 2024, 15:25:01) [GCC 8.5.0 20210514 (Red Hat 8.5.0-22)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from kubernetes import client, config
>>> config.load_config()
>>>
How to reproduce it (as minimally and precisely as possible):
Test with idp-certificate-authority-data containing only the certificat, not the complete chain.
Anything else we need to know?:
When I manually patch idp-certificate-authority-data in ~/.kube/config to contain the complete certificat chain everything works. I think the problem is somewhere around here:
https://github.com/kubernetes-client/python/blob/3e6cc5871997cd52fd8f4c8324894508f3a75136/kubernetes/base/config/kube_config.py#L422-L454
When connecting to TKGI, there's the complete certificat chain. Neither kubectl nor curl nor anything else has a problem there. But this lib has, because it trusts only the certificat from idp-certificate-authority-data, which lacks the intermediate and root CA certificats.
Environment:
- Kubernetes version (
kubectl version):
Client Version: v1.30.7
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.31.5+vmware.1
- OS (e.g., MacOS 10.13.6): Red Hat Enterprise Linux release 8.10 (Ootpa)
- Python version (
python --version): 3.12.8 - Python client version (
pip list | grep kubernetes): 32.0.1
Out of curiosity, what should go into idp-certificate-authority-data? From the name, I assume that this should be the certificate (complete certificate chain?) of the CA that signed the certificate of the IDP. And not the certificate of the IDP. Is this correct?
If I'm right, there might be a bug in TKGI and not in your library. But I couldn't find any documentation or specification about this.
I'm planning to also open an issue with TKGI, and it would be really helpful to get a clarification on this. If anyone knows a documentation or specification about what should go into idp-certificate-authority-data, I would be really grateful to get a hint 🙏
I think the problem is that you set idp-certificate-authority-data as the only valid CA. Both kubectl and curl (using --cacert with the certificate from idp-certificate-authority-data) seem work. I think they use it additionally.
Would it be possible to use idp-certificate-authority-data additionally to the systems usual CA certificate store, instead of using it alone? Alternatively, what do you think about a parameter to ignore idp-certificate-authority-data? This would be only a workaround, though.
/help
@roycaihw: This request has been marked as needing help from a contributor.
Guidelines
Please ensure that the issue body includes answers to the following questions:
- Why are we solving this issue?
- To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
- How can the assignee reach out to you for help?
For more details on the requirements of such an issue, please see here and ensure that they are met.
If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.
In response to this:
/help
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
@roycaihw I'm not really sure what /help means. Well, as far as I understand it added some labels and made your bot post something. But I'm not 100% sure what should happen now.
Just to make this clear: Do you want any help from me providing more information to better understand the problem (which I probably can do) or does it mean you're asking contributors for help to fix this? I guess it's the latter, but I just want to make sure 👼
Any news on this?
I am facing the same issue. I'll probably switch to golang SDK.
With the support of AI, I have added the following to the file kube_config.py above line 560
if self._user:
and indented lines 560-567 accordingly.
I then ran a test with our Ansible playbook, waiting more than 10 minutes between each task. These tests were all successful.
I would appreciate it if someone could check what effects this change has.