code-intelligence
label bot workers stop receiving pubsub messages; issue with workload identity?
As part of #70 we deployed the workers on an updated cluster that uses workload identity.
I'm observing that after the workers have been up for a long time they appear to stop receiving pubsub notifications.
This is visible in cloud console as a growing backlog of messages.
I suspect an issue related to credentials and workload identity. Bouncing the pods appears to fix it.
Related to #70
Issue-Label Bot is automatically applying the label kind/bug to this issue, with a confidence of 0.96. Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback!
Links: app homepage, dashboard and code for this bot.
To try to recover:
- Delete the GKE metadata server pods:
  `kubectl -n kube-system delete pods -l k8s-app=gke-metadata-server`
- Restart the label bot pods:
  `kubectl delete pods -l app=label-bot`
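
The two recovery steps above could be wrapped in one shell function so they always run in the same order. This is only a sketch: the function name `recover_label_bot` is hypothetical, and it assumes the current kubectl context points at the affected cluster and that the label bot pods are in the current namespace.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): runs the two recovery steps in order.
# Assumes the current kubectl context and namespace are the ones used by the label bot.
recover_label_bot() {
  # Step 1: delete the GKE metadata server pods; their controller recreates them.
  kubectl -n kube-system delete pods -l k8s-app=gke-metadata-server || return 1

  # Step 2: delete the label bot pods so replacements start with fresh credentials.
  kubectl delete pods -l app=label-bot || return 1
}
```

Calling `recover_label_bot` after sourcing the file runs both deletions and stops at the first one that fails. Whether this clears the backlog still needs to be checked in the cloud console, since the root cause is only suspected.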