chaoskube
[bug or feature?] pod being killed continuously
The default namespace has two pods. I ran go run main.go --interval 10s --namespaces 'default' --no-dry-run
and the test pod was killed repeatedly. I think this is unfriendly in some scenarios. Is this expected behavior?
Questions:
- I know about --minimum-age, which checks pod.ObjectMeta.CreationTimestamp, but the pod may still be killed again while it is starting. --minimum-age cannot solve the problem of the same pod being killed continuously.
- Should we add an option to handle a pod being killed continuously?
If you think this is a bug or feature, please assign it to me
The killed Pod's name is different, so from the perspective of chaoskube I don't think it's a bug.
The second Pod shouldn't be killed before it's in the Running state (otherwise it would be a bug).
If your "test" application consists of a single Pod, then the --minimum-age setting should help to avoid killing the two Pods one right after another.
--minimum-age uses pod.ObjectMeta.CreationTimestamp for its check, so pods that have been created but are still starting may still be killed. I don't think killing a starting pod is the desired result.
I think it would be reasonable to use an implementation like this, or to add a user-facing parameter that determines whether a starting pod may be killed.
if podAge >= minimumAge && pod.Status.Phase == v1.PodRunning {
    // the pod is old enough and Running, so it may be killed
}
Looking at the code, Pods that are not in the Running state are filtered out early on (before even checking for minimum age).
What can happen is that if you set minimum age to 5 minutes and the Pod itself stays 5 minutes in "Pending" or "Initializing" state, it can get killed right after it switches to "Running". (Because CreationTimestamp is the time the Pod object was created initially.)
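The order of checks described above can be sketched roughly as follows. This is a minimal illustration, not chaoskube's actual code: the Pod struct, the candidates function, and the demo helper are hypothetical stand-ins that use only the Go standard library, and CreatedAt plays the role of pod.ObjectMeta.CreationTimestamp.

```go
package main

import (
	"fmt"
	"time"
)

// Pod is a hypothetical stand-in for the Kubernetes Pod object; it carries
// only the fields this sketch needs.
type Pod struct {
	Name      string
	Phase     string    // e.g. "Pending" or "Running"
	CreatedAt time.Time // stands in for pod.ObjectMeta.CreationTimestamp
}

// candidates returns the pods that are Running and at least minAge old.
// Age is measured from creation, not from the switch to Running.
func candidates(pods []Pod, minAge time.Duration, now time.Time) []Pod {
	var out []Pod
	for _, p := range pods {
		if p.Phase != "Running" {
			continue // non-Running pods are dropped before the age check
		}
		if now.Sub(p.CreatedAt) < minAge {
			continue // too young, counted from the creation timestamp
		}
		out = append(out, p)
	}
	return out
}

// demo builds three sample pods and returns the names of the candidates
// for a minimum age of five minutes.
func demo() []string {
	now := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	pods := []Pod{
		{Name: "starting", Phase: "Pending", CreatedAt: now.Add(-10 * time.Minute)},
		{Name: "just-running", Phase: "Running", CreatedAt: now.Add(-5 * time.Minute)},
		{Name: "fresh", Phase: "Running", CreatedAt: now.Add(-1 * time.Minute)},
	}
	var names []string
	for _, p := range candidates(pods, 5*time.Minute, now) {
		names = append(names, p.Name)
	}
	return names
}

func main() {
	fmt.Println(demo())
}
```

In this sample only "just-running" is selected: it may have entered the Running state a moment ago, but its creation timestamp is already five minutes old, which is exactly the situation described above.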
I believe we understand each other. It was my oversight that I didn't notice the check for the Running state. As you said, "it can get killed right after it switches to Running." Should we avoid this situation?
We should think about it.
If a Pod only gets into the Running state after 5 minutes of initialization and the --minimum-age is set to 2 minutes (for example) then the earliest moment it can be killed should be 7 minutes after the initial creation.
But it might be difficult to implement. Looking at the CreationTimestamp was easy. Using the time a Pod switched to the Running state as the starting point for "minimum age" probably requires looking at the Kubernetes events since there's no such field on the Pod object itself.
However, the current implementation works for most of the cases in real-world clusters that run many Pods. We don't do it currently, but termination during the initialization phase can also be preferable for some users to uncover additional edge cases.
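The arithmetic from the example above (5 minutes of initialization plus a --minimum-age of 2 minutes) can be sketched as below. This is only an illustration of the proposed behavior, not something chaoskube does today: earliestKill and demo are hypothetical names, and how runningSince would be obtained in a real cluster is left open, as discussed above.

```go
package main

import (
	"fmt"
	"time"
)

// earliestKill returns the first moment a pod could be terminated if the
// minimum age were counted from the time it became Running.
// runningSince is assumed to be supplied by the caller.
func earliestKill(runningSince time.Time, minAge time.Duration) time.Time {
	return runningSince.Add(minAge)
}

// demo returns how long after creation the pod first becomes killable
// for a pod that needs 5 minutes to reach Running and a 2 minute minimum age.
func demo() time.Duration {
	created := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	runningSince := created.Add(5 * time.Minute)
	return earliestKill(runningSince, 2*time.Minute).Sub(created)
}

func main() {
	fmt.Println(demo()) // prints 7m0s
}
```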