`Handshake status 401 Unauthorized` during connect_get_namespaced_pod_exec when using service account
What happened (please include outputs or screenshots):
I would like to use kubernetes-client to exec into a pod container. The script is meant to run from within another pod, so it needs to work with a service account. For testing, I run the script from my workstation command line, using my standard kubernetes (admin) user. So the idea is to switch to the service account from within the script (unconditionally, for test purposes). This seems to work. But when I actually exec into a container, I receive an exception:
Traceback (most recent call last):
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/stream/ws_client.py", line 528, in websocket_call
client = WSClient(configuration, url, headers, capture_all, binary=binary)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/stream/ws_client.py", line 68, in __init__
self.sock = create_websocket(configuration, url, headers)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/stream/ws_client.py", line 494, in create_websocket
websocket.connect(url, **connect_opt)
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/websocket/_core.py", line 261, in connect
self.handshake_response = handshake(self.sock, url, *addrs, **options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/websocket/_handshake.py", line 65, in handshake
status, resp = _get_resp_headers(sock)
^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/websocket/_handshake.py", line 150, in _get_resp_headers
raise WebSocketBadStatusException(
websocket._exceptions.WebSocketBadStatusException: Handshake status 401 Unauthorized -+-+- {'audit-id': 'cd85f343-bdb9-4e94-808c-26dd49a34d1f', 'cache-control': 'no-cache, private', 'content-type': 'application/json', 'date': 'Wed, 09 Oct 2024 13:38:29 GMT', 'content-length': '129'} -+-+- b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}\n'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/workspaces/misc/python-client-test/python-client-test.py", line 87, in <module>
main()
File "/workspaces/misc/python-client-test/python-client-test.py", line 75, in main
resp = stream(core_api.connect_get_namespaced_pod_exec,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/stream/stream.py", line 36, in _websocket_request
out = api_method(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/client/api/core_v1_api.py", line 994, in connect_get_namespaced_pod_exec
return self.connect_get_namespaced_pod_exec_with_http_info(name, namespace, **kwargs) # noqa: E501
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/client/api/core_v1_api.py", line 1101, in connect_get_namespaced_pod_exec_with_http_info
return self.api_client.call_api(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/client/api_client.py", line 348, in call_api
return self.__call_api(resource_path, method,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
response_data = self.request(
^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/stream/ws_client.py", line 538, in websocket_call
raise ApiException(status=0, reason=str(e))
kubernetes.client.exceptions.ApiException: (0)
Reason: Handshake status 401 Unauthorized -+-+- {'audit-id': 'cd85f343-bdb9-4e94-808c-26dd49a34d1f', 'cache-control': 'no-cache, private', 'content-type': 'application/json', 'date': 'Wed, 09 Oct 2024 13:38:29 GMT', 'content-length': '129'} -+-+- b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}\
I've executed some more tests and figured out:
- I can exec into the container with the standard user, without changing to the service account first
- Using kubectl with impersonation of the service account also works -- so RBAC seems to be correct
- Switching to the service account and running API calls other than exec also works fine
I conclude: there seems to be an issue during the protocol handover from REST to WebSocket (the call stack suggests this as well, since the 401 is raised during the WebSocket handshake).
Did I do something wrong? Is this a bug? Is there a workaround?
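To make the "handover" concrete, here is a minimal, stdlib-only sketch of what changes between a REST call and the exec call: the URL scheme switches to a WebSocket scheme, and the credentials have to travel again on the upgrade request. The helper names (`to_websocket_url`, `bearer_header`) are hypothetical and are not the client library's own functions; they only illustrate what one could compare when debugging.

```python
from urllib.parse import urlparse, urlunparse


def to_websocket_url(rest_url: str) -> str:
    """Map an http(s) API URL to the matching ws(s) URL (illustration only)."""
    parts = urlparse(rest_url)
    scheme = {"http": "ws", "https": "wss"}.get(parts.scheme, parts.scheme)
    return urlunparse(parts._replace(scheme=scheme))


def bearer_header(token: str) -> str:
    """Format the Authorization header line expected on the upgrade request."""
    return f"authorization: Bearer {token}"
```

If the header that actually reaches the API server on the upgrade request differs from the one used for `list_namespaced_pod`, that would be consistent with the 401 appearing only for exec. This is an assumption to verify, not a confirmed diagnosis.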
What you expected to happen:
I expected to be able to successfully exec into a container.
How to reproduce it (as minimally and precisely as possible):
This is my kubernetes test environment (use kubectl apply -f):
apiVersion: v1
kind: Namespace
metadata:
  name: test-ns
---
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  namespace: test-ns
spec:
  containers:
    - name: test-container
      image: busybox
      command: ["sleep"]
      args: ["infinity"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: test-sa
  namespace: test-ns
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: test-role
  namespace: test-ns
rules:
  - apiGroups: [""] # core API group
    resources: ["pods"]
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["get", "create"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: test-rb
  namespace: test-ns
subjects:
  - kind: ServiceAccount
    name: test-sa
    namespace: test-ns
roleRef:
  kind: Role
  name: test-role
  apiGroup: rbac.authorization.k8s.io
And this is the script I'm trying to execute:
from kubernetes import client, config
from kubernetes.stream import stream

# Usage example
NAMESPACE = "test-ns"
POD_NAME = "test-pod"
SERVICE_ACCOUNT = "test-sa"
COMMAND = ["ls"]


def main():
    # read $KUBECONFIG and instantiate core_api
    config.load_kube_config()
    api_client = client.ApiClient()
    core_api = client.CoreV1Api(api_client)

    #######################################
    # Test case 1: works
    #######################################
    resp = stream(core_api.connect_get_namespaced_pod_exec,
                  POD_NAME,
                  NAMESPACE,
                  command=COMMAND,
                  stderr=True,
                  stdin=False,
                  stdout=True,
                  tty=False)
    print(resp)

    # switch to service account
    tokenRequest = client.AuthenticationV1TokenRequest(
        spec=client.V1TokenRequestSpec(audiences=[""])
    )
    token = core_api.create_namespaced_service_account_token(
        SERVICE_ACCOUNT, NAMESPACE, tokenRequest)
    api_client.configuration.api_key['authorization'] = token.status.token
    api_client.configuration.api_key_prefix['authorization'] = 'Bearer'
    api_client.configuration.key_file = None
    api_client.configuration.cert_file = None

    # re-instantiate core_api
    core_api = client.CoreV1Api(api_client)

    #######################################
    # Test case 2: works
    #######################################
    pods = core_api.list_namespaced_pod(NAMESPACE)
    print(pods)

    #######################################
    # Test case 3: triggers an exception, even though the corresponding kubectl call works:
    # kubectl exec -it -n test-ns test-pod --as system:serviceaccount:test-ns:test-sa -- ls
    #######################################
    resp = stream(core_api.connect_get_namespaced_pod_exec,
                  POD_NAME,
                  NAMESPACE,
                  command=COMMAND,
                  stderr=True,
                  stdin=False,
                  stdout=True,
                  tty=False)
    print(resp)


if __name__ == '__main__':
    main()
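As a possible alternative to mutating the already loaded configuration, one could write the service account token into a separate kubeconfig and load that in a fresh process. The sketch below only builds such a file with the standard library (JSON is valid YAML, so kubeconfig readers accept it). The function names, the cluster name `cluster`, and the idea that this sidesteps the 401 are assumptions; it has not been verified against this issue.

```python
import json


def build_sa_kubeconfig(server, ca_data_b64, token, namespace, sa_name="test-sa"):
    """Return a kubeconfig dict that authenticates with a bearer token only.

    server:      API server URL, e.g. the host from the admin kubeconfig
    ca_data_b64: base64-encoded PEM CA bundle (certificate-authority-data)
    token:       service account token, e.g. token.status.token
    """
    context_name = f"{sa_name}@cluster"
    return {
        "apiVersion": "v1",
        "kind": "Config",
        "clusters": [{
            "name": "cluster",
            "cluster": {"server": server, "certificate-authority-data": ca_data_b64},
        }],
        "users": [{"name": sa_name, "user": {"token": token}}],
        "contexts": [{
            "name": context_name,
            "context": {"cluster": "cluster", "user": sa_name, "namespace": namespace},
        }],
        "current-context": context_name,
    }


def write_kubeconfig(path, kubeconfig):
    """Write the kubeconfig as JSON to the given path."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(kubeconfig, fh, indent=2)
```

The resulting file could then be passed to `config.load_kube_config(config_file=path)` or to `kubectl --kubeconfig`, so that no client certificate from the admin user is present in the configuration at all.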
Environment:
- Kubernetes version (kubectl version):
Client Version: v1.29.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.6
- OS: Microsoft vscode dev container image base:1.0.10-bullseye on Docker Desktop 4.21.1 on WSL2 on Windows 10 Version 22H2, Build 19045.4780
- Python version: 3.12.6
- Python client version: 31.0.0
It works from within a pod container running with the correct service account when using load_incluster_config instead of load_kube_config. That means the 'switch to service account' part is not necessary there.
But it's very annoying and time-consuming if I need to build the docker image and redeploy it, before I can test it. I would like to be able to run the script from outside of the cluster, but having the same RBAC permissions. (Note: it would be really hard to set up a working unit test environment that mocks away all external dependencies for my purpose)
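For a script that should run both inside a pod and from a workstation, a small loader that picks the configuration source can help. This is a minimal sketch: `running_in_cluster` and `load_config` are hypothetical helper names, and the check (the `KUBERNETES_SERVICE_HOST` variable plus the mounted token file) is a simplification of what the in-cluster loader expects. The `kubernetes` import is deferred so the detection itself has no third-party dependency.

```python
import os
from pathlib import Path

# Default mount point of the service account volume inside a pod.
SA_DIR = Path("/var/run/secrets/kubernetes.io/serviceaccount")


def running_in_cluster(environ=None, sa_dir=SA_DIR):
    """Heuristic: API server env var is set and a token file is mounted."""
    environ = os.environ if environ is None else environ
    return "KUBERNETES_SERVICE_HOST" in environ and (Path(sa_dir) / "token").is_file()


def load_config():
    """Use in-cluster credentials when available, else fall back to $KUBECONFIG."""
    from kubernetes import config  # deferred: only needed when actually loading

    if running_in_cluster():
        config.load_incluster_config()
    else:
        config.load_kube_config()
```

This does not solve the wish to get the service account's RBAC permissions from outside the cluster; it only removes the need to edit the script between the two environments.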
/help
@roycaihw: This request has been marked as needing help from a contributor.
Comparing the configuration in both cases, I found one difference: the in-cluster config uses the internal IP address of the kubernetes API, while the KUBECONFIG config uses the external IP address. The CA certificate is the same in both cases. I'm not sure whether this is relevant.
Unfortunately, changing the IP address during the 'switch to service account' step (by assigning api_client.configuration.host) leads to an SSL error:
Traceback (most recent call last):
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/connectionpool.py", line 466, in _make_request
self._validate_conn(conn)
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
conn.connect()
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/connection.py", line 730, in connect
sock_and_verified = _ssl_wrap_socket_and_match_hostname(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/connection.py", line 909, in _ssl_wrap_socket_and_match_hostname
ssl_sock = ssl_wrap_socket(
^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/util/ssl_.py", line 469, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/util/ssl_.py", line 513, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python/current/lib/python3.12/ssl.py", line 455, in wrap_socket
return self.sslsocket_class._create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python/current/lib/python3.12/ssl.py", line 1041, in _create
self.do_handshake()
File "/usr/local/python/current/lib/python3.12/ssl.py", line 1319, in do_handshake
self._sslobj.do_handshake()
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:1000)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/connectionpool.py", line 789, in urlopen
response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/connectionpool.py", line 490, in _make_request
raise new_e
urllib3.exceptions.SSLError: EOF occurred in violation of protocol (_ssl.c:1000)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/workspaces/misc/python-client-test/python-client-test.py", line 88, in <module>
main()
File "/workspaces/misc/python-client-test/python-client-test.py", line 48, in main
pods = core_api.list_namespaced_pod(NAMESPACE)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/client/api/core_v1_api.py", line 15823, in list_namespaced_pod
return self.list_namespaced_pod_with_http_info(namespace, **kwargs) # noqa: E501
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/client/api/core_v1_api.py", line 15942, in list_namespaced_pod_with_http_info
return self.api_client.call_api(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/client/api_client.py", line 348, in call_api
return self.__call_api(resource_path, method,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
response_data = self.request(
^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/client/api_client.py", line 373, in request
return self.rest_client.GET(url,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/client/rest.py", line 244, in GET
return self.request("GET", url,
^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/kubernetes/client/rest.py", line 217, in request
r = self.pool_manager.request(method, url,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/_request_methods.py", line 135, in request
return self.request_encode_url(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/_request_methods.py", line 182, in request_encode_url
return self.urlopen(method, url, **extra_kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/poolmanager.py", line 443, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/connectionpool.py", line 873, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/connectionpool.py", line 873, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/connectionpool.py", line 873, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/connectionpool.py", line 843, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "/workspaces/misc/python-client-test/.venv/lib/python3.12/site-packages/urllib3/util/retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='172.16.128.1', port=443): Max retries exceeded with url: /api/v1/namespaces/test-ns/pods (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1000)')))
For anyone else stumbling upon this, this can also happen if your authentication token expires and you don't refresh it.
In my case I was using Google Cloud, so I just needed to refresh the credentials before making any API calls if running for more than half an hour.
Example: updating the K8S credentials with a new token derived from Google Application Default Credentials:
from kubernetes import client
from google.auth import default
from google.auth.transport.requests import Request

credentials, _ = default()


def refresh():
    credentials.refresh(Request())
    configuration = client.Configuration.get_default_copy()
    configuration.api_key = {"authorization": f"Bearer {credentials.token}"}
    client.Configuration.set_default(configuration)
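To check whether expiry is the cause of a 401, it can help to look at the token itself. Kubernetes service account tokens (such as the one returned by the TokenRequest call in the original script) are JWTs, so their claims can be read without any extra library. The sketch below decodes the payload without verifying the signature, which is fine for debugging only. The helper names are hypothetical, and this does not apply to opaque tokens such as Google OAuth access tokens.

```python
import base64
import json
import time


def jwt_claims(token):
    """Decode the (unverified) payload of a JWT into a dict."""
    payload_b64 = token.split(".")[1]
    # JWT segments drop base64 padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def seconds_until_expiry(token, now=None):
    """Seconds left until the 'exp' claim; negative if already expired."""
    now = time.time() if now is None else now
    return jwt_claims(token)["exp"] - now


def is_expired(token, now=None):
    """True if the token's 'exp' claim lies in the past."""
    return seconds_until_expiry(token, now) <= 0
```

Printing `jwt_claims(token)` also shows the `aud` claim, which makes it easy to see which audiences the token was actually issued for.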