Kong pod doesn't start up after cluster restart: needs Kong 3.9
What happened?
The issue is:
https://github.com/Kong/kong/issues/13730#issuecomment-2650153987
We are using the latest kubernetes-dashboard, which in turn uses Kong. Whenever the cluster nodes are restarted (be it my Windows machine running Docker Desktop with single-node k8s, or our 3-node dev k8s cluster running on the latest Ubuntu with containerd), I get the same issue:
nginx: [emerg] bind() to unix:/kong_prefix/sockets/we failed (98: Address already in use)
Once I delete the Kong pod, everything returns to normal. I would appreciate a fix for this.
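In the meantime, deleting the Kong pod and letting the Deployment recreate it is enough to recover. A minimal sketch, assuming the label selector used in the diagnostics later in this thread (adjust if your labels differ):
# Delete the stuck Kong pod; the Deployment recreates it with a clean socket directory.
kubectl delete pod -n kubernetes-dashboard \
  -l app.kubernetes-dashboard=kong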
This is already fixed in Kong, so the bundled Kong should be updated to 3.9 to resolve the issue.
For now, we can work around it by overriding the Kong image version like this:
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard \
--create-namespace \
--namespace kubernetes-dashboard \
--set kong.image.repository=kong \
--set kong.image.tag="3.9.0"
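A quick way to confirm the override took effect, assuming the default release naming so the Kong Deployment is called kubernetes-dashboard-kong (adjust if yours differs):
# Print the image reference the Kong Deployment is currently configured with.
kubectl get deployment kubernetes-dashboard-kong -n kubernetes-dashboard \
  -o jsonpath='{.spec.template.spec.containers[*].image}{"\n"}'
# Should show an image reference ending in kong:3.9.0 once the rollout completes.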
What did you expect to happen?
Whenever the cluster nodes are restarted (be it my Windows machine running Docker Desktop with single-node k8s, or our 3-node dev k8s cluster running on the latest Ubuntu with containerd), we should still be able to access the k8s dashboard.
How can we reproduce it (as minimally and precisely as possible)?
Just restart all the k8s nodes.
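Roughly, on a multi-node cluster (or by restarting Docker Desktop / WSL2 on a single-node setup); a sketch, with the label selector borrowed from the diagnostics later in this thread:
# Reboot each node, then check the Kong pod once the cluster is back up.
sudo reboot
kubectl get pods -n kubernetes-dashboard
kubectl logs -n kubernetes-dashboard -l app.kubernetes-dashboard=kong --tail=20
# The logs show the nginx "Address already in use" bind error quoted above.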
Anything else we need to know?
No response
What browsers are you seeing the problem on?
No response
Kubernetes Dashboard version
7.10.4
Kubernetes version
1.30.5
Dev environment
No response
I have also encountered the same issue with kubernetes-dashboard 7.11.1 and Kong 3.8 in WSL2.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
Any news on this?
Hello, is there any news regarding this issue? We really need this fix to be available ASAP on our side, as we are hitting this error on a fairly regular basis.
If this could be merged, thanks!
TL;DR: try with
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard \
--create-namespace \
--namespace kubernetes-dashboard \
--set kong.image.repository=docker.io/library/kong \
--set kong.image.tag="3.9.0"
Diagnosis:
kubectl describe pod $(kubectl get pods -n kubernetes-dashboard -l app.kubernetes-dashboard=kong -o jsonpath='{.items[0].metadata.name}') -n kubernetes-dashboard
In the output you should see something like: Failed to inspect image "": rpc error: code = Unknown desc = short name mode is enforcing, but image name kong:3.8 returns ambiguous list
Explanation:
In my case, the issue is related to how the container runtime resolves short image names.
On Kubernetes nodes running CRI-O (and also Podman), image resolution is controlled by /etc/containers/registries.conf. This file defines:
- unqualified-search-registries → the list of registries to search when a short name like kong:3.8 is used.
- short-name-mode → whether ambiguous short names are allowed. Possible values are enforcing, permissive, or disabled.
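For illustration, a hypothetical excerpt of /etc/containers/registries.conf on such a node (the actual registry list and mode vary by distribution):
unqualified-search-registries = ["registry.access.redhat.com", "docker.io"]
short-name-mode = "enforcing"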
You can verify the current settings on a node with:
cat /etc/containers/registries.conf | grep -A2 short-name-mode
In theory, setting short-name-mode = "permissive" should allow the runtime to try the registries in the search list and pull the image. However, in my case this did not work, most likely because the runtime still failed to resolve the ambiguity: kong:3.8 could point either to docker.io/library/kong or something else. Since CRI-O couldn’t make that distinction, the pull continued to fail.
It’s also possible to confirm this kind of error directly from Kubernetes by inspecting the pod events.
kubectl describe pod $(kubectl get pods -n kubernetes-dashboard -l app.kubernetes-dashboard=kong -o jsonpath='{.items[0].metadata.name}') -n kubernetes-dashboard
The Events section will show the ImageInspectError and the “short name mode is enforcing/permissive…” message, which makes it clear that the runtime is struggling to resolve the image reference.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 66s default-scheduler Successfully assigned kubernetes-dashboard/kubernetes-dashboard-kong-6465d78779-9jd87 to kube-bootstrap-yf3l0.mkt.local
Normal AddedInterface 66s multus Add eth0 [10.85.0.30/16 1100:200::1e/24] from crio
Warning InspectFailed 51s (x4 over 66s) kubelet Failed to inspect image "": rpc error: code = Unknown desc = short name mode is enforcing, but image name kong:3.8 returns ambiguous list
Warning Failed 51s (x4 over 66s) kubelet Error: ImageInspectError
Warning InspectFailed 13s (x3 over 25s) kubelet Failed to inspect image "": rpc error: code = Unknown desc = short name mode is enforcing, but image name kong:3.8 returns ambiguous list
Warning Failed 13s (x3 over 25s) kubelet Error: ImageInspectError
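If you have shell access to the node, the same resolution behaviour can be checked outside Kubernetes with crictl (a sketch; crictl must be installed on the node):
# Pulling by short name should surface the same short-name resolution error,
# while the fully qualified reference pulls cleanly:
sudo crictl pull kong:3.8
sudo crictl pull docker.io/library/kong:3.8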
In my opinion, the cleanest way to fix it is not to rely on short names at all but to set the full repository explicitly in the Helm values. For example:
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard \
--create-namespace \
--namespace kubernetes-dashboard \
--set kong.image.repository=docker.io/library/kong \
--set kong.image.tag="3.9.0"
This way, the runtime doesn’t need to guess, and the deployment will consistently pull the correct Kong image.
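If you would rather keep the override in a file than repeat --set flags, the same settings can live in a local values file (hypothetical name kong-image-values.yaml) and be applied with -f:
# Hypothetical local override file with the same content as the --set flags above.
cat > kong-image-values.yaml <<'EOF'
kong:
  image:
    repository: docker.io/library/kong
    tag: "3.9.0"
EOF

helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard \
  --create-namespace \
  --namespace kubernetes-dashboard \
  -f kong-image-values.yaml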
The long-term fix would be to adjust the default value in the Helm chart itself. In the upstream [values.yaml](https://github.com/kubernetes/dashboard/blob/master/charts/kubernetes-dashboard/values.yaml), the kong section could be updated to explicitly specify the repository, for example:
kong:
  image:
    repository: docker.io/library/kong
    tag: "3.9.0"
This way, anyone installing the chart avoids the ambiguity without having to override the values manually.
I agree with the long-term solution; it seems like a no-brainer for the defaults in values.yaml. If the maintainers don't have enough time to work on that, can we at least get a version bump at https://github.com/kubernetes/dashboard/blob/2f3f7b01a3e23a7a2c0011455a165ef68af81c10/charts/kubernetes-dashboard/Chart.yaml#L47? According to the Kong Helm chart, Kong 3.9 has been available since chart version 2.47:
pi@pi$ helm search repo kong/kong --versions | egrep 3.9
kong/kong 2.52.0 3.9 The Cloud-Native Ingress and API-management
kong/kong 2.51.0 3.9 The Cloud-Native Ingress and API-management
kong/kong 2.50.0 3.9 The Cloud-Native Ingress and API-management
kong/kong 2.49.0 3.9 The Cloud-Native Ingress and API-management
kong/kong 2.48.0 3.9 The Cloud-Native Ingress and API-management
kong/kong 2.47.0 3.9 The Cloud-Native Ingress and API-management
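For reference, the appVersion of any given kong chart release can be checked directly, assuming the Kong chart repository is added locally under the name kong:
# Add the Kong chart repo if it is not already configured locally.
helm repo add kong https://charts.konghq.com
helm repo update
# Inspect the chart metadata of the 2.47.0 release.
helm show chart kong/kong --version 2.47.0 | grep -E '^(version|appVersion):'
# Should show version: 2.47.0 with an appVersion of 3.9, the first chart release bundling Kong 3.9.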