
IDP certificate issue

Open mariolenz opened this issue 6 months ago • 8 comments

This might be basically the same as #2160, or maybe it's just related. I'm not sure, but since this issue is closed I thought I should open a new one.

We're using TKGI to create and manage Kubernetes clusters. TKGI is also the IDP (OIDC) for the clusters. When I run tkgi get-credentials <cluster-name>, a ~/.kube/config is created which contains the necessary information.

When the access token expires, kubectl is perfectly able to get a new one with the info from the config. But this Python lib isn't. I think the problem is that TKGI only writes the certificate and not the complete chain into idp-certificate-authority-data.

Please let me stress again that kubectl does not have a problem with that!

What happened (please include outputs or screenshots):

$ python
Python 3.12.8 (main, Dec  9 2024, 15:25:01) [GCC 8.5.0 20210514 (Red Hat 8.5.0-22)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from kubernetes import client, config
>>> config.load_config()
Traceback (most recent call last):
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 464, in _make_request
    self._validate_conn(conn)
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 1093, in _validate_conn
    conn.connect()
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connection.py", line 741, in connect
    sock_and_verified = _ssl_wrap_socket_and_match_hostname(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connection.py", line 920, in _ssl_wrap_socket_and_match_hostname
    ssl_sock = ssl_wrap_socket(
               ^^^^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/util/ssl_.py", line 460, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/util/ssl_.py", line 504, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/ssl.py", line 455, in wrap_socket
    return self.sslsocket_class._create(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/ssl.py", line 1041, in _create
    self.do_handshake()
  File "/usr/lib64/python3.12/ssl.py", line 1319, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1000)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 488, in _make_request
    raise new_e
urllib3.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1000)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/config/__init__.py", line 42, in load_config
    load_kube_config(**kwargs)
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/config/kube_config.py", line 826, in load_kube_config
    loader.load_and_set(config)
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/config/kube_config.py", line 589, in load_and_set
    self._load_authentication()
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/config/kube_config.py", line 288, in _load_authentication
    if self._load_auth_provider_token():
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/config/kube_config.py", line 307, in _load_auth_provider_token
    return self._load_oid_token(provider)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/config/kube_config.py", line 413, in _load_oid_token
    self._refresh_oidc(provider)
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/config/kube_config.py", line 450, in _refresh_oidc
    response = client.request(
               ^^^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/client/api_client.py", line 373, in request
    return self.rest_client.GET(url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/client/rest.py", line 244, in GET
    return self.request("GET", url,
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/kubernetes/client/rest.py", line 217, in request
    r = self.pool_manager.request(method, url,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/_request_methods.py", line 135, in request
    return self.request_encode_url(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/_request_methods.py", line 182, in request_encode_url
    return self.urlopen(method, url, **extra_kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/poolmanager.py", line 443, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 871, in urlopen
    return self.urlopen(
           ^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 871, in urlopen
    return self.urlopen(
           ^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 871, in urlopen
    return self.urlopen(
           ^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/connectionpool.py", line 841, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/data/venv-ansible-vmware/lib64/python3.12/site-packages/urllib3/util/retry.py", line 519, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='tkgi-playground.it.nrw.de', port=8443): Max retries exceeded with url: /oauth/token/.well-known/openid-configuration (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1000)')))
>>>

What you expected to happen:

$ python
Python 3.12.8 (main, Dec  9 2024, 15:25:01) [GCC 8.5.0 20210514 (Red Hat 8.5.0-22)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from kubernetes import client, config
>>> config.load_config()
>>>

How to reproduce it (as minimally and precisely as possible): Test with idp-certificate-authority-data containing only the certificate, not the complete chain.
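
To check whether a given kubeconfig matches this reproduction case, the value of idp-certificate-authority-data can be decoded and its PEM blocks counted. This is a minimal sketch using only the standard library. The function name count_pem_certificates is made up for illustration and is not part of the kubernetes client.

```python
import base64
import re

# Matches one PEM-encoded certificate block, including its BEGIN/END markers.
PEM_CERT_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def count_pem_certificates(b64_data: str) -> int:
    """Return how many PEM certificate blocks a base64-encoded value holds.

    b64_data is expected to be the raw string stored under
    idp-certificate-authority-data in the kubeconfig.
    """
    pem_text = base64.b64decode(b64_data).decode("utf-8")
    return len(PEM_CERT_RE.findall(pem_text))
```

A result of 1 would suggest that only a single certificate is stored, as described in this issue. A larger number would suggest that a chain is already present.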

Anything else we need to know?:

When I manually patch idp-certificate-authority-data in ~/.kube/config to contain the complete certificate chain, everything works. I think the problem is somewhere around here:

https://github.com/kubernetes-client/python/blob/3e6cc5871997cd52fd8f4c8324894508f3a75136/kubernetes/base/config/kube_config.py#L422-L454

When connecting to TKGI, the server presents the complete certificate chain. Neither kubectl nor curl nor anything else has a problem there. But this lib does, because it trusts only the certificate from idp-certificate-authority-data, which lacks the intermediate and root CA certificates.
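
The manual patch described above could be scripted roughly as follows. This is only a sketch: append_chain is a hypothetical helper, the intermediate and root certificates have to be obtained separately as PEM text, and writing the result back into ~/.kube/config is left out.

```python
import base64


def append_chain(idp_ca_b64: str, chain_pem: str) -> str:
    """Return a new base64 value: the existing PEM data followed by chain_pem.

    idp_ca_b64 is the current idp-certificate-authority-data string;
    chain_pem holds the missing intermediate/root certificates in PEM form.
    """
    existing_pem = base64.b64decode(idp_ca_b64).decode("utf-8")
    # Make sure the existing block ends with a newline before appending.
    if not existing_pem.endswith("\n"):
        existing_pem += "\n"
    bundle = existing_pem + chain_pem
    return base64.b64encode(bundle.encode("utf-8")).decode("ascii")
```

The returned string would replace the old idp-certificate-authority-data value. If the kubeconfig is regenerated later, the patch would presumably have to be applied again.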

Environment:

  • Kubernetes version (kubectl version):
      Client Version: v1.30.7
      Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
      Server Version: v1.31.5+vmware.1
  • OS (e.g., MacOS 10.13.6): Red Hat Enterprise Linux release 8.10 (Ootpa)
  • Python version (python --version): 3.12.8
  • Python client version (pip list | grep kubernetes): 32.0.1

mariolenz, Jun 18 '25 07:06

Out of curiosity, what should go into idp-certificate-authority-data? From the name, I assume that this should be the certificate (complete certificate chain?) of the CA that signed the certificate of the IDP. And not the certificate of the IDP. Is this correct?

If I'm right, there might be a bug in TKGI and not in your library. But I couldn't find any documentation or specification about this.

I'm also planning to open an issue with TKGI, and it would be really helpful to get clarification on this. If anyone knows of any documentation or specification about what should go into idp-certificate-authority-data, I would be really grateful for a hint 🙏

mariolenz, Jun 23 '25 16:06

I think the problem is that you set idp-certificate-authority-data as the only valid CA. Both kubectl and curl (using --cacert with the certificate from idp-certificate-authority-data) seem to work. I think they use it in addition to the system's CA store.

Would it be possible to use idp-certificate-authority-data in addition to the system's usual CA certificate store, instead of using it alone? Alternatively, what do you think about a parameter to ignore idp-certificate-authority-data? This would only be a workaround, though.
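
The difference between the two trust models can be sketched with Python's standard ssl module. This is not the kubernetes client's code or API. build_idp_ssl_context and its parameters are hypothetical names used only to illustrate "IDP CA in addition to the system store" versus "IDP CA alone".

```python
import ssl
from typing import Optional


def build_idp_ssl_context(extra_ca_pem: Optional[str] = None,
                          use_system_store: bool = True) -> ssl.SSLContext:
    """Build a client-side TLS context for talking to the IDP.

    use_system_store=True  -> system CA store plus extra_ca_pem (the proposal)
    use_system_store=False -> only extra_ca_pem is trusted (IDP CA alone)
    """
    if use_system_store:
        # Loads the system's default CA certificates.
        ctx = ssl.create_default_context()
    else:
        # Starts with an empty trust store; verification is still required.
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if extra_ca_pem:
        # Adds the PEM certificate(s) to whatever is already trusted.
        ctx.load_verify_locations(cadata=extra_ca_pem)
    return ctx
```

With use_system_store=True, a chain that ends in a CA from the system store would verify even if extra_ca_pem holds only a single certificate. Whether that single certificate alone is accepted as a trust anchor may also depend on OpenSSL's partial-chain setting (ssl.VERIFY_X509_PARTIAL_CHAIN, available since Python 3.10), which this sketch does not change.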

mariolenz, Jul 08 '25 13:07

/help

roycaihw, Jul 16 '25 20:07

@roycaihw: This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to this:

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot, Jul 16 '25 20:07

@roycaihw I'm not really sure what /help means. As far as I understand, it added some labels and made your bot post something. But I'm not 100% sure what should happen now.

Just to make this clear: Do you want help from me in the form of more information to better understand the problem (which I can probably provide), or does it mean you're asking contributors for help to fix this? I guess it's the latter, but I just want to make sure 👼

mariolenz, Jul 17 '25 15:07

Any news on this?

mariolenz, Aug 29 '25 16:08

I am facing the same issue. I'll probably switch to the Go SDK.

AtitShetty, Oct 16 '25 13:10

With the support of AI, I have added the following to the file kube_config.py above line 560 (if self._user:) and indented lines 560-567 accordingly. I then ran a test with our Ansible playbook, waiting more than 10 minutes between each task. These tests were all successful. I would appreciate it if someone could check what effects this change has.

JoschuaA4, Nov 05 '25 10:11