Builder pods not removed after deploy
Currently (as of deis-builder v2.7.1) the slugbuild and dockerbuild pods are not deleted after a successful or failed build.
This means that the pod (e.g. slugbuild-example-e24fafeb-b31237bb) continues to exist in state "Completed" or "Error", and the docker container associated with it can never be garbage collected by Kubernetes, causing the node to quickly run out of disk space.
Example:
On a k8s node with an uptime of 43 days and 95 GB of disk storage for docker, there were 249 completed (or failed) slugbuild and dockerbuild pods whose docker images accounted for 80 GB of disk storage, while the deployed apps and deis services only required 15 GB.
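To get a rough picture of this on a node (only a sketch, assuming docker keeps its data under /var/lib/docker, which may vary by distribution):

# disk used by docker's data directory on the node
df -h /var/lib/docker
# number of exited containers still held on the node
docker ps -aq --filter "status=exited" | wc -l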
Expected Behavior:
The expected behavior would be for the builder to automatically delete the build pod after it has completed or failed, so that K8s garbage collection can remove the docker containers and free the disk space allocated to them.
This behavior can easily be inspected with:
kubectl get --namespace deis --show-all pods | grep build-
The number of completed pods will increase by one for each build.
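Until the builder cleans these up itself, a rough manual workaround (only a sketch, assuming all build pods live in the deis namespace and carry the slugbuild-/dockerbuild- prefix; xargs -r is GNU xargs) is to delete the terminated pods by hand:

kubectl get --namespace deis --show-all pods | awk '/^(slugbuild|dockerbuild)-/ && ($3 == "Completed" || $3 == "Error") {print $1}' | xargs -r kubectl delete --namespace deis pods

This only frees space retroactively; the pods come back with every new build until the builder deletes them itself.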
related: https://github.com/deis/builder/issues/57
It seems like recent versions of k8s stopped cleaning up pods in the "success" state. Some research probably needs to be done on how to turn this functionality back on.
I'm running K8s 1.4.x if that matters.
Regarding the suggestion in #57 to use Jobs: neither Jobs nor Pods are removed automatically.
From the K8s Job docs:
When a Job completes, no more Pods are created, but the Pods are not deleted either. Since they are terminated, they don’t show up with kubectl get pods, but they will show up with kubectl get pods -a. Keeping them around allows you to still view the logs of completed pods to check for errors, warnings, or other diagnostic output. The job object also remains after it is completed so that you can view its status. It is up to the user to delete old jobs after noting their status. Delete the job with kubectl (e.g. kubectl delete jobs/pi or kubectl delete -f ./job.yaml). When you delete the job using kubectl, all the pods it created are deleted too.
Interestingly the docs on Pod Lifecycle say:
In general, Pods do not disappear until someone destroys them. This might be a human or a controller. The only exception to this rule is that Pods with a phase of Succeeded or Failed for more than some duration (determined by the master) will expire and be automatically destroyed.
This seems to be in contrast to what I'm actually seeing…
I have opened kubernetes/kubernetes#41787 for clarification of the above statement from the docs.
I just got feedback on the kubernetes issue: it looks like, by default, completed or failed pods are only garbage collected once there are more than 12,500 terminated pods. Obviously that is not very helpful in this case, so an automatic cleanup by the builder should be implemented.
Quoting here from the kube-controller-manager help on the --terminated-pod-gc-threshold <n> option:
Number of terminated pods that can exist before the terminated pod garbage collector starts deleting terminated pods. If <= 0, the terminated pod garbage collector is disabled. (default 12500)
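As a cluster-level stopgap one could lower that threshold; a minimal sketch (assuming a setup where the kube-controller-manager command line can be edited, e.g. a static manifest) would be:

# add to the existing kube-controller-manager invocation, keeping the other flags
kube-controller-manager --terminated-pod-gc-threshold=100

Note that this affects all terminated pods in the cluster, not just the builder's, so a builder-side cleanup would still be the cleaner fix.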
Any progress on this? It sounds like a waste of resources and space for everyone.
Same here; it may be linked to an issue I opened last week.
$ kubectl get --namespace deis --show-all pods | grep build-
slugbuild-teslabit-web-production-d2fcd4c0-7e507178 0/1 Completed 0 1d
I'm using this tiny git pre-push hook for deletion: https://gist.github.com/pfeodrippe/116c8b570ee2ffcdce8aa15bbae5a22b.
It deletes the last slugbuild pod created for the app when you git push.
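For reference, a minimal sketch of such a hook (not the linked gist verbatim; it assumes the deis app name is known, kubectl points at the right cluster, and the newest matching pod is the one to remove):

#!/bin/sh
# .git/hooks/pre-push -- delete the most recent slugbuild pod for the app
# APP is a placeholder; replace it with your deis app name.
APP=example
POD=$(kubectl get --namespace deis --show-all pods --sort-by=.metadata.creationTimestamp | awk -v p="slugbuild-$APP-" 'index($1, p) == 1 {name = $1} END {print name}')
[ -n "$POD" ] && kubectl delete --namespace deis pod "$POD"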
+1 This bit me after a couple of weeks of deploying applications to my deis cluster.
This issue was moved to teamhephy/builder#17