kepler
Refactor pod_lister
pod_lister issue: Some short-lived pods cannot be detected at ticker time, because kubelet does not report deleted pods.
Proposal:
1. Detect pod create/delete events from the API server through the Kubernetes API.
2. Keep the pod/container info until x seconds after the pod is deleted (x is currently set to 30s, but we might change it to the ticker interval).
notices:
- Multiple dependencies (vendors) have changed; run go mod vendor to update.
- Both take a container ID as input, so the code still relies on the pod_lister.GetContainerIDFromcGroupID function from the previous implementation to convert a cGroup ID to a container ID. For older kernel versions, however, we may need to handle the process ID instead of the cGroup ID.
- The modified code comes with unit tests, and its functionality has been confirmed in the experimental environment. Nevertheless, it introduces relatively major changes, so I believe discussion is needed.
Signed-off-by: Sunyanan Choochotkaew [email protected]
@sunya-ch thank you for identifying the issue and providing this fix. I agree that transient pods are quite elusive, and this issue deserves some good thinking to get it right.
There are two ways to track transient pods: the kube API server and the local crio. Getting info from the API server is authoritative yet may be less scalable, while querying crio is more scalable but requires a good design around security concerns.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.