krane
Timeout if pods are stuck in ContainerCreating
In my current project we have some pods that run as tasks and can take up to 2 hours to complete, so we had to set the timeout to 2 hours. Sometimes pods get stuck in ContainerCreating, in which case we only find out after the full timeout period has elapsed. Is there a way to make kubernetes-deploy time out if pods stay in this state for more than a few minutes? That would give faster feedback, which I believe is desirable.
Hi @viniciusgama, can you tell us a bit more about why the tasks need to be deployed as pods and not jobs? We've got logic in the job class that sounds like it handles what you want: https://github.com/Shopify/kubernetes-deploy/blob/master/lib/krane/kubernetes_resource/job.rb#L7
Thanks for your quick reply @dturn.
I said jobs but I actually meant to say tasks. In my team we are doing as suggested here. In our case we run database migrations and copy assets before we can roll out the application itself, but the issue is not restricted to these tasks; it can happen with any sort of pod.
I don't think the link you posted will help me right now. Do you recall any other way this can be achieved, or would we have to implement something?
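If something does have to be implemented, the check being asked for could be sketched roughly as follows. This is only an illustration and not part of kubernetes-deploy: the function name `stuck_in_container_creating` and the 5-minute default are hypothetical. It assumes pod objects shaped like the output of `kubectl get pods -o json`, and it treats `status.startTime` (falling back to `metadata.creationTimestamp`) as an approximation of when the pod entered ContainerCreating.

```python
from datetime import datetime, timedelta, timezone


def parse_k8s_time(value):
    """Parse a Kubernetes timestamp such as 2024-01-01T12:00:00Z into an aware datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def stuck_in_container_creating(pod, now, limit=timedelta(minutes=5)):
    """Return True if a container is waiting in ContainerCreating for longer than `limit`.

    Hypothetical helper: `pod` is a dict in the shape returned by
    `kubectl get pod <name> -o json`. The start of the wait is approximated
    by status.startTime, or metadata.creationTimestamp if that is missing.
    """
    status = pod.get("status", {})
    waiting = any(
        (cs.get("state", {}).get("waiting") or {}).get("reason") == "ContainerCreating"
        for cs in status.get("containerStatuses", [])
    )
    if not waiting:
        return False
    started = status.get("startTime") or pod["metadata"]["creationTimestamp"]
    return now - parse_k8s_time(started) > limit


# Example input: a pod that has been creating its container for 10 minutes.
sample_pod = {
    "metadata": {"name": "db-migrate", "creationTimestamp": "2024-01-01T11:59:58Z"},
    "status": {
        "startTime": "2024-01-01T12:00:00Z",
        "containerStatuses": [
            {"name": "task", "state": {"waiting": {"reason": "ContainerCreating"}}}
        ],
    },
}
sample_now = datetime(2024, 1, 1, 12, 10, tzinfo=timezone.utc)
print(stuck_in_container_creating(sample_pod, sample_now))
```

A deploy wrapper could poll the pods with such a check and fail early, rather than waiting for the full 2-hour timeout.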
Internally, we run long-lived DB migrations out of band using a job. The template resource looks something like:
apiVersion: batch/v1
kind: Job
metadata:
  name: long-db-migrate
spec:
  backoffLimit: 3
  activeDeadlineSeconds: 172800 # Allow running for 48 hours
  template:
    metadata:
      name: long-db-migrate
    spec:
      restartPolicy: Never
Would this approach work for you?