Gateway: Configbump does not refresh expired service account tokens, leading to ConfigMap sync failure

Open vgiardino-sw opened this issue 3 months ago • 2 comments

Describe the bug

This issue appears to be related to issue #23230, which describes a similar problem.

Problem Description

When running che-incubator/configbump (version 7.104.0) in Kubernetes clusters (tested on AKS v1.32.6), the application begins logging repeated authentication errors after approximately one hour of uptime:

reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.1-0.20190409021438-1a26190bd76a+incompatible/tools/cache/reflector.go:94: Failed to watch *v1.ConfigMap: the server has asked for the client to provide credentials (get configmaps)

reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.1-0.20190409021438-1a26190bd76a+incompatible/tools/cache/reflector.go:94: Failed to list *v1.ConfigMap: Unauthorized

As a result, ConfigMaps are no longer synchronized, which breaks configuration propagation to consuming components.
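To check whether the errors line up with token expiry, one option is to decode the exp claim of the mounted token from inside the pod. The following is a minimal sketch, not part of configbump: the helper name tokenExpiry is hypothetical, it assumes the token is a JWT with an integer exp claim, and it does not verify the signature.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"strings"
	"time"
)

// tokenExpiry decodes the payload of a JWT (without verifying the
// signature) and returns the time stored in its "exp" claim.
// Hypothetical helper for diagnosis only.
func tokenExpiry(token string) (time.Time, error) {
	parts := strings.Split(strings.TrimSpace(token), ".")
	if len(parts) != 3 {
		return time.Time{}, errors.New("token is not a three-part JWT")
	}
	// JWT segments are base64url-encoded without padding.
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return time.Time{}, fmt.Errorf("decode payload: %w", err)
	}
	var claims struct {
		Exp int64 `json:"exp"` // assumption: exp is an integer number of seconds
	}
	if err := json.Unmarshal(payload, &claims); err != nil {
		return time.Time{}, fmt.Errorf("parse claims: %w", err)
	}
	if claims.Exp == 0 {
		return time.Time{}, errors.New("token has no exp claim")
	}
	return time.Unix(claims.Exp, 0).UTC(), nil
}

func main() {
	const tokenPath = "/var/run/secrets/kubernetes.io/serviceaccount/token"
	raw, err := os.ReadFile(tokenPath)
	if err != nil {
		// Outside a pod the file does not exist; report and stop.
		fmt.Println("token file not readable:", err)
		return
	}
	exp, err := tokenExpiry(string(raw))
	if err != nil {
		fmt.Println("could not read expiry:", err)
		return
	}
	fmt.Printf("token expires at %s (in %s)\n",
		exp.Format(time.RFC3339), time.Until(exp).Round(time.Second))
}
```

If the reported expiry is roughly one hour after pod start and the first Unauthorized log line appears around that time, that is consistent with the process still using the token it read at startup.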

Root Cause

The issue stems from configbump using an outdated version of k8s.io/client-go that does not support automatic ServiceAccount token reloading.

Background

Since Kubernetes v1.21, the BoundServiceAccountTokenVolume feature is enabled by default, introducing the following behavior:

  • Tokens mounted in pods at /var/run/secrets/kubernetes.io/serviceaccount/token are now short-lived.
  • These tokens are automatically refreshed on disk by the kubelet before expiry.
  • However, clients must re-read the token from disk periodically to keep using a valid token.

According to the Kubernetes CHANGELOG for v1.21:

k8s.io/client-go version v11.0.0+ and v0.15.0+ reload tokens automatically.

However, inspecting the code shows that configbump is using a very old version of client-go (ref):

k8s.io/client-go v11.0.1-0.20190409021438-1a26190bd76a+incompatible

Although this appears to be version 11, the timestamp embedded in the pseudo-version (20190409021438, i.e. 2019-04-09) reveals it's an older, pre-release snapshot from 2019, before support for automatic token rotation was introduced.
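For reference, the commit date can be read directly out of the pseudo-version string. The sketch below uses only the standard library; the function name pseudoVersionTime is hypothetical, and it only extracts the date, it says nothing about which features that commit contains.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// pseudoVersionTime extracts the commit timestamp from a Go module
// pseudo-version such as
// "v11.0.1-0.20190409021438-1a26190bd76a+incompatible".
// Hypothetical helper for illustration.
func pseudoVersionTime(version string) (time.Time, error) {
	// Drop build metadata such as "+incompatible".
	if i := strings.IndexByte(version, '+'); i >= 0 {
		version = version[:i]
	}
	parts := strings.Split(version, "-")
	if len(parts) < 3 {
		return time.Time{}, fmt.Errorf("%q is not a pseudo-version", version)
	}
	// The timestamp is the second-to-last dash-separated field. It may be
	// prefixed with "0." or "<pre>.0.", so keep only what follows the last dot.
	field := parts[len(parts)-2]
	if i := strings.LastIndexByte(field, '.'); i >= 0 {
		field = field[i+1:]
	}
	return time.Parse("20060102150405", field)
}

func main() {
	ts, err := pseudoVersionTime("v11.0.1-0.20190409021438-1a26190bd76a+incompatible")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ts.Format(time.RFC3339)) // prints 2019-04-09T02:14:38Z
}
```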

Proposed Solution

We locally rebuilt the che-incubator/configbump image using a newer client-go version (v0.17.2), and the issue was resolved.

We recommend upgrading the dependencies to more recent versions:

  • Upgrade k8s.io/client-go to v0.21.0 or newer
  • Update controller-runtime accordingly

This would allow configbump to fully support token rotation and avoid disruptions after token expiration.

Additional Resources:

vgiardino-sw, Oct 07 '25 17:10

CC'ing @mkuznyetsov for comment. If this is reproducible with the given steps, it looks like a dependency update for configbump.

rgrunber, Oct 08 '25 04:10

Hi @rgrunber, @mkuznyetsov, any news on this?

LMantovan, Oct 27 '25 07:10