Can't build plugin registry on the dogfooding instance

Open svor opened this issue 1 year ago • 8 comments

Describe the bug

It's not possible to build the plugin registry image on the dogfooding instance. The build fails with a "no space left on device" error:

ERRO[0062] error deleting build container "d0313f27de952575f60db71659b2710db32e29d47990d62854706af3c542d1af": identifier is not a container 
Error: identifier is not a container: preparing container for next step: creating build container: creating container: creating read-write layer with ID "dd8750be8985f4742d377f7d804f61861d298cb0e57cf732c64057fd63709d98": **no space left on device**

Che version

next (development version)

Steps to reproduce

  1. Start a ws from https://che-dogfooding.apps.che-dev.x6e0.p1.openshiftapps.com/#/https://github.com/redhat-developer/devspaces/tree/sv-fix-devfile
  2. Run "5. Build" command

Expected behavior

The image should be built

Runtime

OpenShift

Screenshots

screenshot-che-dogfooding apps che-dev x6e0 p1 openshiftapps com-2024 04 12-13_35_27

Installation method

chectl/next

Environment

Linux

Eclipse Che Logs

No response

Additional context

No response

svor avatar Apr 12 '24 10:04 svor

@ibuziuk @dkwon17 when I change the storage type to Ephemeral, the image builds without errors. My default storage type is Per-workspace.

svor avatar Apr 12 '24 10:04 svor

@dkwon17 could you please comment? Is it related to https://github.com/eclipse-che/che/issues/22914 ?

ibuziuk avatar Apr 15 '24 16:04 ibuziuk

It's not related to https://github.com/eclipse-che/che/issues/22914. We are running into this issue because persistent user home is enabled: images are stored in the home directory, which is where space is running out.

I noticed building the plugin registry image requires about 17GB, while the PVC for the workspace only has the default 5GB.
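To confirm the shortfall from inside a workspace, one can compare the free space on the filesystem backing the image store with what the build needs. This is a minimal sketch, assuming the rootless default graphroot of `~/.local/share/containers/storage` and treating the roughly 17GB figure above as the requirement; `has_room` is a hypothetical helper name.

```python
import os
import shutil

# Rough build requirement quoted above (assumption: decimal gigabytes).
REQUIRED_BYTES = 17 * 10**9


def has_room(path: str, required: int = REQUIRED_BYTES) -> bool:
    """Return True if the filesystem holding `path` has at least `required` bytes free."""
    # Walk up to the nearest existing directory so the check also works
    # before the image store has been created.
    probe = os.path.abspath(path)
    while not os.path.exists(probe):
        probe = os.path.dirname(probe)
    return shutil.disk_usage(probe).free >= required


if __name__ == "__main__":
    # Assumed default location of the rootless image store.
    graphroot = os.path.expanduser("~/.local/share/containers/storage")
    print(graphroot, "has enough free space:", has_room(graphroot))
```

On a workspace with the default 5GB PVC mounted under the home directory, this check would be expected to report that there is not enough room.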

There are a few solutions to make the build work:

  • Change the storagetype to Ephemeral
  • Introduce a very large volume specifically for the images (example)
  • Use fuse overlay to decrease the stored image layer sizes (this is not configured on the dogfooding cluster yet)
  • Try to edit the ~/.config/containers/storage.conf file to change the graphroot directory to another directory outside of the home directory (docs)
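For the last option, the change amounts to a small `storage.conf` whose `graphroot` points outside the home directory. This is a sketch only: the target path `/tmp/containers-storage` and the helper names are hypothetical, and the `overlay` driver is an assumption about the workspace image.

```python
from pathlib import Path


def render_storage_conf(graphroot: str, driver: str = "overlay") -> str:
    """Build a minimal containers-storage config that relocates the image store."""
    return (
        "[storage]\n"
        f'driver = "{driver}"\n'
        f'graphroot = "{graphroot}"\n'
    )


def write_storage_conf(conf_path: Path, graphroot: str) -> None:
    """Write the rendered config, creating parent directories if needed."""
    conf_path.parent.mkdir(parents=True, exist_ok=True)
    conf_path.write_text(render_storage_conf(graphroot))


if __name__ == "__main__":
    # Only print the config here; call write_storage_conf() with
    # Path.home() / ".config/containers/storage.conf" to actually apply it.
    print(render_storage_conf("/tmp/containers-storage"))
```

Image layers stored this way do not survive a workspace restart, since the new location is no longer on the persistent volume.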

dkwon17 avatar Apr 15 '24 21:04 dkwon17

@dkwon17 thank you for the information. The Ephemeral storage type works for me. I also tried setting the volume size to 35Gi and 40Gi, but the build still failed:

screenshot-che-dogfooding apps che-dev x6e0 p1 openshiftapps com-2024 04 16-14_26_58

What I don't understand is why it used to work before and now it doesn't.

svor avatar Apr 16 '24 12:04 svor

I think it stopped working recently because we have persistent home enabled and per-workspace persistent home was fixed recently: https://github.com/devfile/devworkspace-operator/pull/1241

dkwon17 avatar Apr 16 '24 14:04 dkwon17

> I think it stopped working recently because we have persistent home enabled and per-workspace persistent home was fixed recently: devfile/devworkspace-operator#1241

This is also what I believe caused this issue to arise. Previously, the $HOME directory where the graphroot directory resides was ephemeral, until persistent $HOME was fixed for per-workspace storage.

AObuchow avatar Apr 16 '24 14:04 AObuchow

Thinking about the potential solutions @dkwon17 suggested:

  • Change the storagetype to Ephemeral:

    • Good easy workaround that can be accomplished by users quickly, however persistent storage is useful in many cases and losing persistent storage could break workflows/degrade UX.
  • Introduce a very large volume specifically for the images (example)

    • Easy to accomplish, but seems to be a wasteful (and expensive) use of resources.
  • Use fuse overlay to decrease the stored image layer sizes (this is not configured on the dogfooding cluster yet)

    • Seems like the optimal solution (and is my personal favourite), but requires configuration and upgrading to OCP 4.15. This allows us to keep persistent workspace storage without needing to waste too much PVC space for image layers. Additionally, since we are still storing image layers on the PVC, UX is improved as users can re-start their workspace and resume building images without having to re-pull or create image layers.
  • Try to edit the ~/.config/containers/storage.conf file to change the graphroot directory to another directory outside of the home directory (docs)

    • This is my second favourite option as we don't waste any PVC space. However, image layers would be ephemeral and have to be re-pulled or created upon workspace restart.
    • We could potentially explore adding a new environment variable used in the UDI, e.g. $PERSIST_IMAGE_LAYERS that could be enabled/disabled at the Devfile level: when enabled, the graphroot directory resides in the $HOME directory, and when disabled, graphroot resides outside of $HOME. This would be a bit quirky, documentation-wise, as $PERSIST_IMAGE_LAYERS would depend on persistUserHome being enabled in the Che Cluster CR.
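The `$PERSIST_IMAGE_LAYERS` idea could look roughly like the following at workspace startup. This is purely illustrative: the variable does not exist today, and the function name, the accepted value `true`, and the ephemeral path `/tmp/containers-storage` are all assumptions.

```python
import posixpath
from typing import Mapping


def pick_graphroot(env: Mapping[str, str]) -> str:
    """Choose the image store location from a hypothetical PERSIST_IMAGE_LAYERS flag."""
    home = env.get("HOME", "/home/user")
    if env.get("PERSIST_IMAGE_LAYERS", "").lower() == "true":
        # Persisted: keep layers under $HOME, which lives on the PVC
        # when persistUserHome is enabled.
        return posixpath.join(home, ".local/share/containers/storage")
    # Not persisted: use a location outside $HOME (hypothetical path).
    return "/tmp/containers-storage"


if __name__ == "__main__":
    print(pick_graphroot({"HOME": "/home/user", "PERSIST_IMAGE_LAYERS": "true"}))
    print(pick_graphroot({"HOME": "/home/user"}))
```

The chosen path would then be written to `graphroot` in `storage.conf`.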

AObuchow avatar Apr 16 '24 14:04 AObuchow

Another option is to define an ephemeral volume for /home/user/.local: https://github.com/dkwon17/devspaces/blob/ce03e4acafc880c063e441202eff12ef40e330d4/.devfile.yaml#L23-L25
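The shape of that devfile change can be sketched as a data structure: a volume component marked `ephemeral`, mounted at `/home/user/.local` in the tooling container. The sketch below builds the fragment and prints it as JSON, which is also valid YAML; the component names and the image are placeholders, not the values from the linked devfile.

```python
import json


def ephemeral_local_volume(container_name: str, image: str) -> dict:
    """Build a devfile fragment that mounts an ephemeral volume at /home/user/.local."""
    return {
        "schemaVersion": "2.2.0",
        "components": [
            {
                "name": container_name,
                "container": {
                    "image": image,
                    # The mount shadows the persistent home for this subtree only.
                    "volumeMounts": [{"name": "local", "path": "/home/user/.local"}],
                },
            },
            # Ephemeral volume: its contents are discarded when the workspace stops.
            {"name": "local", "volume": {"ephemeral": True}},
        ],
    }


if __name__ == "__main__":
    # Placeholder names; substitute the real component name and image.
    print(json.dumps(ephemeral_local_volume("tools", "example.com/tools:latest"), indent=2))
```

The rest of the home directory stays on the PVC, so only the image layers under `~/.local` are lost on restart.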

dkwon17 avatar Apr 29 '24 01:04 dkwon17

The problem was resolved by updating the devfile. It's now possible to build the plugin registry on the dogfooding instance.

svor avatar Jul 24 '24 10:07 svor