Re-run initContainers in a Deployment when containers exit on error
Is this a BUG REPORT or FEATURE REQUEST?:
/kind feature
What happened: A container in a Deployment exits on error; the container is restarted without the initContainer being re-run first.
What you expected to happen: When a container in a Deployment exits on error, the initContainer is re-run before the container is restarted.
How to reproduce it (as minimally and precisely as possible):
Sample spec:
```yaml
kind: "Deployment"
apiVersion: "extensions/v1beta1"
metadata:
  name: "test"
  labels:
    name: "test"
spec:
  replicas: 1
  selector:
    matchLabels:
      name: "test"
  template:
    metadata:
      name: "test"
      labels:
        name: "test"
    spec:
      initContainers:
      - name: sleep
        image: debian:stretch
        imagePullPolicy: IfNotPresent
        command:
        - sleep
        - 1s
      containers:
      - name: test
        image: debian:stretch
        imagePullPolicy: IfNotPresent
        command:
        - /bin/sh
        - -c
        - exit 1
```
Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`):
```
Client Version: version.Info{Major:"", Minor:"", GitVersion:"v0.0.0-master+$Format:%h$", GitCommit:"db809c0eb7d33fac8f54d8735211f2f3a8fc4214", GitTreeState:"clean", BuildDate:"2017-09-11T19:46:47Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"", Minor:"", GitVersion:"v0.0.0-master+$Format:%h$", GitCommit:"db809c0eb7d33fac8f54d8735211f2f3a8fc4214", GitTreeState:"clean", BuildDate:"2017-09-11T19:46:47Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
```
- OS (e.g. from /etc/os-release): Debian GNU/Linux 9 (stretch)
- Kernel (e.g. `uname -a`): Linux aleinung 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u3 (2017-08-06) x86_64 GNU/Linux
Implementation Context:
I have an initContainer that waits for a service running in Kubernetes to detect its existence via pod annotations and send it an HTTP request, upon which it writes the received value to disk. On startup, the main container reads this value and "unwraps" it via another service, then stores the unwrapped value in memory.
The value written to disk by the initContainer is a one-time-read value: once it is used, it expires. The problem is that if the main container ever restarts due to a fatal error, it loses the unwrapped value and on startup tries to unwrap the expired value again, leading to an infinite crash loop until I manually delete the pod; then a new pod is created, the initContainer runs, and all is well again.
I would like a feature that restarts the entire pod on container error so that this workflow can function properly.
/sig node
Good catch. When an init container is used to acquire a certificate or token, the main container may remove the file after reading it into its cache. The container may then restart repeatedly after a single panic.
@aisengard I think the use-case you are talking about can be simulated by sharing volume between container and init-container, isn't it?
I have updated the config to include volumeMount that is shared between container and init-container.
```yaml
kind: "Deployment"
apiVersion: "extensions/v1beta1"
metadata:
  name: "test"
  labels:
    name: "test"
spec:
  replicas: 1
  selector:
    matchLabels:
      name: "test"
  template:
    metadata:
      name: "test"
      labels:
        name: "test"
    spec:
      initContainers:
      - name: sleep
        image: debian:stretch
        imagePullPolicy: IfNotPresent
        command:
        - sh
        - -c
        - 'echo "create by init-container" > /dir/file'
        volumeMounts:
        - mountPath: /dir
          name: shared
      containers:
      - name: test
        image: debian:stretch
        imagePullPolicy: IfNotPresent
        command:
        - sh
        - -c
        - "cat /dir/file && sleep 99999s"
        volumeMounts:
        - mountPath: /dir
          name: shared
      volumes:
      - name: shared
        emptyDir: {}
```
Running it:
```console
$ k create -f file.yml
deployment "test" created
$ k get pods
NAME                    READY     STATUS    RESTARTS   AGE
test-3165636750-b497p   1/1       Running   0          4s
$ k logs test-3165636750-b497p
create by init-container
```
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Prevent issues from auto-closing with an /lifecycle frozen comment.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale
/cc @Random-Liu @yujuhong
/remove-lifecycle stale
https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#pod-restart-reasons
> All containers in a Pod are terminated while restartPolicy is set to Always, forcing a restart, and the Init Container completion record has been lost due to garbage collection.

In practice, however, when all containers are terminated the pod does not restart and the init containers do not rerun.
How about introducing a new RestartPolicy value such as AlwaysPod which means always restart the pod whenever any container dies?
I.e. whenever any (non-init) container of the pod dies, the remaining healthy containers should be terminated and the pod restarted (on the same node), starting with the init containers.
This approach can cover one of the common/simple use-case for init-containers to wait for dependencies or some action required before any of the pod's containers (re)start.
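For illustration only, a pod spec under such a proposal might look like this. `AlwaysPod` is a hypothetical value sketched from the comment above; it does not exist in any Kubernetes API version:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: worker
spec:
  # Hypothetical proposed value: when any container dies, terminate the
  # remaining containers and restart the whole pod, init containers first.
  restartPolicy: AlwaysPod
  initContainers:
  - name: fetch-token
    image: debian:stretch
    command: ["sh", "-c", "echo one-time-token > /dir/token"]
    volumeMounts:
    - mountPath: /dir
      name: shared
  containers:
  - name: main
    image: debian:stretch
    command: ["sh", "-c", "cat /dir/token && rm /dir/token && sleep 3600"]
    volumeMounts:
    - mountPath: /dir
      name: shared
  volumes:
  - name: shared
    emptyDir: {}
```

With today's `Always` policy, a crash of `main` after it has removed the token leads to the crash loop described in this issue; under the proposed policy the init container would run again first.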
An AlwaysPod or perhaps RestartPod RestartPolicy would also be very useful for this use-case: I'm using initContainers to get a job from a work-queue and then sequentially run a series of containers to process it. Once the job is finished I'd like the pod to restart and thus wait for the next job. The Job resource doesn't seem to support indefinite completions, and I don't want the overhead of something like brigade or argo for what should be a pretty simple use-case.
The AlwaysPod RestartPolicy could also be used to make stateful apps/services more self-managed, using init containers and sidecar containers for simpler management tasks such as backups, while keeping controllers/operators for more complicated operations.
Is there any current workaround to restart pod when container restarts?
> All containers in a Pod are terminated while restartPolicy is set to Always, forcing a restart, and the Init Container completion record has been lost due to garbage collection.

I read that in the docs and see it quoted again here. Has anyone confirmed a way to get into a state in which pods are restarted when a container restarts, as the documentation describes?
@majgis AFAIK, the only possible workaround is to bake some coordination between the pod's containers into the containers themselves.
I am working on a PR for implementing the AlwaysPod restartPolicy which will address this problem of restarting pod on container failure. I am planning to raise the PR next week.
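Until something like that exists, the coordination workaround mentioned above can be sketched as a wrapper entrypoint for the main container that redoes the init step on every (re)start. This is only a sketch: the file path and the `fetch_secret` step are stand-ins for whatever the real init container does.

```shell
#!/bin/sh
# Hypothetical entrypoint wrapper: since Kubernetes will not rerun
# initContainers when only the main container restarts, redo the init
# step here before handing off to the real command.
set -eu

SECRET_FILE="${SECRET_FILE:-./secret}"

fetch_secret() {
  # Stand-in for the real init logic (e.g. an HTTP call to a secret service).
  echo "fresh-secret-$(date +%s)" > "$SECRET_FILE"
}

# If a previous run consumed the one-time value, produce a new one.
if [ ! -s "$SECRET_FILE" ]; then
  fetch_secret
fi

# Hand off to the container's real command, e.g. `wrapper.sh my-app --flag`.
exec "$@"
```

The obvious downside is that the fetch logic now lives in the main image instead of a dedicated init container.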
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
In response to this:
> Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@aisengard did you ever find a solution to this? We hit exactly the same issue today. We have an initContainer that reads some secret data from Vault and writes it to an emptyDir volume shared between the initContainer and the first container in the pod. The first container reads this file when executing its command and then deletes it, so that no one can enter the pod and read the file. But if the container restarts, the initContainer isn't run again, so the file doesn't exist.
/reopen
@adamzr: You can't reopen an issue/PR unless you authored it or you are a collaborator.
In response to this:
/reopen
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Is there any update or a different issue opened for this?
Any idea how to solve this issue? It shuts down my production backend system.
The only solution I have now is to "manually" (via a cron task) delete the pod when it reaches the CrashLoopBackOff state.
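That cron workaround can be scripted roughly as follows. This is a sketch: the awk column positions assume default `kubectl get pods` output, and the kubectl calls only run when `RUN_KUBECTL=1`, so the filter itself can be tried without a cluster.

```shell
#!/bin/sh
# Sketch of the cron workaround: delete pods stuck in CrashLoopBackOff.

# Print the names of pods whose STATUS column is CrashLoopBackOff.
# Expects `kubectl get pods --no-headers` output (NAME READY STATUS RESTARTS AGE).
crashlooping_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Guarded so the filter above can be exercised without a cluster.
if [ "${RUN_KUBECTL:-0}" = "1" ]; then
  kubectl get pods --no-headers \
    | crashlooping_pods \
    | xargs -r -n1 kubectl delete pod
fi
```

Note that `xargs -r` (skip running on empty input) is a GNU/BSD extension, and deleting the pod this way loses the backoff pause, as discussed below in this thread.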
/reopen
Still an issue I think.
@jsravn: You can't reopen an issue/PR unless you authored it or you are a collaborator.
In response to this:
/reopen
Still an issue I think.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
> Any idea how to solve this issue? It shuts down my production backend system. The only solution which I have now is to "manually" (via CRON task) delete pod if it reach CrashLoopBackOff state.
@petrknap another idea, while also a hack, is to use one of the various k8s client API libraries to watch for this condition and delete the pod upon detection.
A few downsides to this, amongst I'm sure many others, are that if the new pod is scheduled on another node the image has to be re-pulled, you lose the 'backoff' logic and its associated pausing, and any metrics & alerts related to crash looping likely need adjusting. In my case I'm handling the latter two concerns with another layer of hacks inside the "crash loop watcher", and would certainly much rather have native handling of this.
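A minimal sketch of such a watcher in shell rather than a client API library; the threshold and the sleep are illustrative stand-ins for the lost backoff logic, not a real replacement for it:

```shell
#!/bin/sh
# Sketch of a "crash loop watcher": delete a pod once it has crash-looped
# past a threshold, with a crude pause in place of the lost backoff.

THRESHOLD="${THRESHOLD:-5}"

# Decide whether a pod's (status, restart count) warrants deletion.
should_delete() {
  [ "$1" = "CrashLoopBackOff" ] && [ "$2" -ge "$THRESHOLD" ]
}

# Guarded so the policy function can be exercised without a cluster.
if [ "${RUN_WATCH:-0}" = "1" ]; then
  # Default `kubectl get pods` columns: NAME READY STATUS RESTARTS AGE
  kubectl get pods --watch --no-headers |
  while read -r name ready status restarts age; do
    if should_delete "$status" "$restarts"; then
      sleep 30   # crude pause instead of the real backoff
      kubectl delete pod "$name"
    fi
  done
fi
```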
This would be very useful for us as well.
We have a daemonset which creates a gRPC socket on the node. This is created with root group ownership. Then we have a second daemonset which runs and accesses this socket to capture events logged by the first daemonset and export them to prometheus. Because this socket is owned by root, the "exporter" container requires root.
To work around the root requirement, we set up an init container in the exporter daemonset manifest to change the socket permissions on the host volume. When the container in the first daemonset is killed (OOM), the group ownership of the socket reverts back to root. The exporter container then crashloops.
At this point we expected the exporter daemonset to re-run the initContainer first to fix the socket permissions, but it does not, so the exporter continues to crashloop.
Restarting the exporter daemonset or deleting the exporter pods seems to work but it would be great if the whole pod would restart and run the init container again.
> Good catch, When use init container to acquire certificate or token, containers may remove it after read to cache. Then containers may re-run repeatedly after once panic.
Exactly my use case. It would be really nice if there was a solution for this.
bump
bump
I ran into a similar issue. We are running Bitnami Redis on our Kubernetes cluster as a StatefulSet (Helm chart: redis-16.8.10, app: 6.2.7), and each pod has the following components:
- an init container that sets the permissions on the PV
- the redis container
- the sentinel container
- the metrics container

After a long running time, the redis container sometimes fails with the error "Can't open the append-only file: Read-only file system" and is restarted continuously, while the other pods run as expected.
The only way out of this situation is to kill the existing pod, which forces the creation of a new one. But Kubernetes doesn't kill the pod; it only restarts the problematic container.
Can we have a feature that executes the init containers every time another container is restarted, or that kills the pod and waits for another pod to spin up?
/reopen
This is such unexpected behaviour for a long-time k8s user. The proposed solution of adding another RestartPolicy value that forces all containers to restart seems reasonable - any thoughts from sig-node?