Problem with /tmp file
Describe the bug: Flux fills up /tmp
To Reproduce: Flux copies the repository at regular intervals but does not delete the files from the tmp directory, which eventually becomes full.
Expected behavior: Fluxcd should probably clean up after itself?
Logs: "no space left on device"
Flux version: 2.4.0
Fluxcd should probably clean up after itself?
It does, so something else on the node is probably blocking it. Check the source-controller logs for any errors.
No, I checked the source-controller logs and found only 'info' level entries, no errors. So it looks like /tmp is not being cleaned up.
We noticed the same issue. We run the source controller in Kubernetes with /tmp mounted on an emptyDir volume (see spec in this chart).
As far as I can tell every time we run flux reconcile source git mygitrepo a new folder is created in /tmp that looks like /tmp/gitrepository-flux-system-mygitrepo-someid. This folder exists for a brief period (I assume some checksums are compared) and then it gets removed.
The problem is that this removal doesn't always happen. I can't consistently reproduce it but I see some leftover folders that were never deleted. In our case it eventually leads to increased memory usage and the container gets OOMKilled.
I checked the logs but all I have is "info" level messages and no indication of a problem whatsoever.
$ flux version
flux: v2.5.1
distribution: flux-2.5.1
helm-controller: v1.2.0
image-automation-controller: v0.40.0
image-reflector-controller: v0.34.0
kustomize-controller: v1.5.1
notification-controller: v1.5.0
source-controller: v1.5.0
I can't explain how the tmp cleanup would fail without the error being logged:
https://github.com/fluxcd/source-controller/blob/4aa31dcc21fa570122d91678ab6352d050481374/internal/controller/gitrepository_controller.go#L279-L293
No matter what happens during the reconciliation, we remove the tmp dir and if it fails, we log an error.
I suspect the cleanup fails if the container gets OOMKilled in the middle of it. After this happens a few times and more garbage accumulates in /tmp, the container constantly gets OOMKilled on startup. The workaround for us was to delete the pod so that both /tmp and /data (emptyDir) are wiped. We also allocated more memory, and it no longer seems to happen.