Can't build plugin registry on the dogfooding instance
Describe the bug
It's not possible to build the plugin registry image on the dogfooding instance. The build fails with a "no space left on device" error:
ERRO[0062] error deleting build container "d0313f27de952575f60db71659b2710db32e29d47990d62854706af3c542d1af": identifier is not a container
Error: identifier is not a container: preparing container for next step: creating build container: creating container: creating read-write layer with ID "dd8750be8985f4742d377f7d804f61861d298cb0e57cf732c64057fd63709d98": **no space left on device**
Che version
next (development version)
Steps to reproduce
- Start a workspace from https://che-dogfooding.apps.che-dev.x6e0.p1.openshiftapps.com/#/https://github.com/redhat-developer/devspaces/tree/sv-fix-devfile
- Run "5. Build" command
Expected behavior
The image should be built
Runtime
OpenShift
Screenshots
Installation method
chectl/next
Environment
Linux
Eclipse Che Logs
No response
Additional context
No response
@ibuziuk @dkwon17 when I change the storage type to Ephemeral, the image builds without errors. By default I have the Per-workspace storage type.
@dkwon17 could you please comment? Is it related to https://github.com/eclipse-che/che/issues/22914 ?
It's not related to https://github.com/eclipse-che/che/issues/22914. We are running into this issue because persistent user home is enabled, and images are stored in the home directory, where space is running out.
I noticed building the plugin registry image requires about 17GB, while the PVC for the workspace only has the default 5GB.
There are a few solutions to make the build work:
- Change the storagetype to Ephemeral
- Introduce a very large volume specifically for the images (example)
- Use fuse overlay to decrease the stored image layer sizes (this is not configured on the dogfooding cluster yet)
- Try to edit the ~/.config/containers/storage.conf file to change the graphroot directory to another directory outside of the home directory (docs)
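For the first option, a minimal sketch of how a devfile could request ephemeral storage. It assumes the controller.devfile.io/storage-type attribute recognized by the DevWorkspace Operator; the schema version and workspace name are placeholders and are not taken from the dogfooding devfile:

```yaml
schemaVersion: 2.2.0
metadata:
  name: plugin-registry-build   # hypothetical workspace name
attributes:
  # Assumption: this attribute switches the workspace from the default
  # per-workspace PVC to ephemeral (emptyDir-backed) storage.
  controller.devfile.io/storage-type: ephemeral
```

With ephemeral storage nothing in the workspace volumes survives a workspace stop, so this trades persistence for not being bound by the 5GB PVC.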
@dkwon17 thank you for the information
Ephemeral storage type works for me
I tried setting the volume size to 35Gi and 40Gi, but the build still failed.
What I don't understand is why it used to work before and now it doesn't.
I think it stopped working recently because we have persistent home enabled and per-workspace persistent home was fixed recently: https://github.com/devfile/devworkspace-operator/pull/1241
This is also what I believe caused this issue to arise. Previously, the $HOME directory where the graphroot directory resides was ephemeral, until persistent $HOME was fixed for per-workspace storage.
Thinking about the potential solutions @dkwon17 suggested:
- Change the storagetype to Ephemeral:
  - Good easy workaround that can be accomplished by users quickly, however persistent storage is useful in many cases and losing persistent storage could break workflows/degrade UX.
- Introduce a very large volume specifically for the images (example)
  - Easy to accomplish, but seems to be a wasteful (and expensive) use of resources.
- Use fuse overlay to decrease the stored image layer sizes (this is not configured on the dogfooding cluster yet)
  - Seems like the optimal solution (and is my personal favourite), but requires configuration and upgrading to OCP 4.15. This allows us to keep persistent workspace storage without needing to waste too much PVC space for image layers. Additionally, since we are still storing image layers on the PVC, UX is improved as users can re-start their workspace and resume building images without having to re-pull or create image layers.
- Try to edit the ~/.config/containers/storage.conf file to change the graphroot directory to another directory outside of the home directory (docs)
  - This is my second favourite option as we don't waste any PVC space. However, image layers would be ephemeral and have to be re-pulled or created upon workspace restart.
  - We could potentially explore adding a new environment variable used in the UDI, e.g. $PERSIST_IMAGE_LAYERS that could be enabled/disabled at the Devfile level: when enabled, the graphroot directory resides in the $HOME directory, and when disabled, graphroot resides outside of $HOME. This would be a bit quirky, documentation-wise, as $PERSIST_IMAGE_LAYERS would depend on persistUserHome being enabled in the Che Cluster CR.
Another option is to define an ephemeral volume for /home/user/.local:
https://github.com/dkwon17/devspaces/blob/ce03e4acafc880c063e441202eff12ef40e330d4/.devfile.yaml#L23-L25
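A minimal sketch of what such an ephemeral volume could look like in a devfile. It is not a copy of the linked file: the component names and the container image are placeholders. Rootless Podman keeps its graphroot under $HOME/.local/share/containers/storage by default, so mounting an ephemeral volume at /home/user/.local should keep image layers off the persistent home PVC:

```yaml
schemaVersion: 2.2.0
metadata:
  name: plugin-registry-build   # hypothetical workspace name
components:
  - name: tools                 # hypothetical container component
    container:
      # Placeholder image; use whatever image the workspace already runs.
      image: quay.io/devfile/universal-developer-image:ubi8-latest
      volumeMounts:
        - name: local
          path: /home/user/.local
  - name: local                 # hypothetical volume component
    volume:
      # Not backed by the workspace PVC; contents are lost when the
      # workspace stops, so image layers must be rebuilt or re-pulled.
      ephemeral: true
```

The rest of the home directory stays persistent; only the content under /home/user/.local becomes ephemeral.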
The problem was resolved by updating the devfile; it's now possible to build the plugin registry on the dogfooding instance.