troubleshoot
troubleshoot copied to clipboard
filter out any pods in status.phase=Failed
Logs collector was failing when it encountered a pod in the Shutdown state - filter out any pods in status.phase=Failed
Ref https://app.shortcut.com/replicated/story/53140/troubleshoot-stops-collecting-logs-when-it-encounters-a-pod-in-shutdown-state
This shouldn't be hardcoded here. This function is used to collect logs.
Today, the logs collector can't get logs from the underlying filesystem, and the API won't return logs from a pod that's been Shutdown - I think the same problem occurs if you try to exec into a pod in Shutdown.
The change here adds a filter to listPodsInSelector and I find the only places that use that function are in exec, logs, ceph, and copy collectors, and it's used to build a list of pods to call either logs or exec against - since both logs and exec will fail against a Shutdown pod I think it's a safe change. @divolgin do you think this is the incorrect approach?
need to double check if #643 resolves this already
I'm going to close this for now and re-open if it's necessary just to get this off the radar