kwok
kwok copied to clipboard
Add a pod stage file for failing an initContainer (WIP)
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR adds a pod stage that injects an error to initContainer specified by a label and annotations, including the initContainer's name and optional failure message, reason, exit code and delay parameters.
labels:
pod-init-container-running-failed.stage.kwok.x-k8s.io: 'true'
annotations:
pod-init-container-running-failed.stage.kwok.x-k8s.io/container-name: <failed initContainer name>
// optional
pod-init-container-running-failed.stage.kwok.x-k8s.io/reason: <failure reason, the default is "initContainerError">
pod-init-container-running-failed.stage.kwok.x-k8s.io/message: <failure message, the default is "initContainer reported errors" >
pod-init-container-running-failed.stage.kwok.x-k8s.io/exit-code: <exit code, the default is 1>
pod-init-container-running-failed.stage.kwok.x-k8s.io/delay: <delay in milliseconds>
pod-init-container-running-failed.stage.kwok.x-k8s.io/jitter-delay: <jitter delay in milliseconds>
Which issue(s) this PR fixes:
Fixes # None
Special notes for your reviewer:
Questions:
- How to set the default values for durationFrom and jitterDurationFrom
- Do we need to set the statues of the other initContainers after the failed initContainer?
Does this PR introduce a user-facing change?
Support the failure of an initContainer in a pod.
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
Deploy Preview for k8s-kwok canceled.
| Name | Link |
|---|---|
| Latest commit | d7d3929564ce72cf7b89ce10516d37b502d5bb48 |
| Latest deploy log | https://app.netlify.com/sites/k8s-kwok/deploys/661d4a7404994a000853919b |
/cc @wzshiming
Thanks for the fixes, @wzshiming! All the changes were committed. I still have two questions.
- Would it be possible to use some default values if the
duration/jitterDurationannotations are not specified? - The statuses of those initContainers defined after the failed initContainers should NOT be
terminatedwithCompleted, right?
Would it be possible to use some default values if the duration/jitterDuration annotations are not specified?
Sure, set both durationMilliseconds and durationFrom.expressionFrom, the durationMilliseconds will like a default for it.
The statues of those initContainers after the failed initContainers should NOT be terminated with Completed, right?
Yes, the Completed reason is only for the 0 exit code
https://github.com/containerd/containerd/blob/406e9e84b4fb0a6bf2912f56ff1dfe78a56ed5ee/internal/cri/server/container_status.go#L89-L95
Would it be possible to use some default values if the duration/jitterDuration annotations are not specified?
Sure, set both
durationMillisecondsanddurationFrom.expressionFrom, thedurationMillisecondswill like a default for it.
Set both durationMilliseconds and jitterDurationMillisecons to 1000 by default (no jitter).
The statues of those initContainers after the failed initContainers should NOT be terminated with Completed, right?
Yes, the
Completedreason is only for the 0 exit codehttps://github.com/containerd/containerd/blob/406e9e84b4fb0a6bf2912f56ff1dfe78a56ed5ee/internal/cri/server/container_status.go#L89-L95
My question was whether we should distinguish an initContainers defined before the failed initContainers and one after the failed initContainer. The former's state is terminated while the latter's is waiting.
/lgtm
Thank you, I think it's a good start for chaos testing
Removed the WIP label. Can you approve it again? Thanks!
/retest
New changes are detected. LGTM label has been removed.
Fixed the linkyaml issues (trailing spaces). PTAL, @wzshiming. Thanks!
/approve /lgtm /label tide/merge-method-squash
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: wzshiming, yuanchen8911
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [wzshiming]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment