It's not possible to build images using the UDI
Describe the bug
Currently it is not possible to use `podman build`, `podman info`, and other `podman` commands in the UDI.
The error is:
```
Error: failed to mount overlay for metacopy check with "" options: permission denied
```
Che version
next (development version)
Steps to reproduce
- Start any workspace that uses the UDI in a dev component (for example, https://github.com/che-samples/web-nodejs-sample)
- Try to run the `podman info` command in the terminal
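In the broken state, the terminal session looks like this (error message copied from above):

```
$ podman info
Error: failed to mount overlay for metacopy check with "" options: permission denied
```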
Expected behavior
It should be possible to build images in a component that uses the UDI.
Runtime
OpenShift
Screenshots
Installation method
chectl/latest
Environment
Linux
Eclipse Che Logs
No response
Additional context
It seems like /home/user/.config/containers/storage.conf is missing, but /home/tooling/.config/containers/storage.conf is present. For some reason, it seems like the `stow` command ~~in the Dockerfile itself is not creating a symbolic link for storage.conf from the /home/tooling/ directory to /home/user/~~ EDIT: The `stow` command in the Dockerfile is working as expected; it's the `stow` command in the entrypoint that's failing.
What's weird is that running `stow . -t /home/user/ -d /home/tooling/ --no-folding -v 2 > /tmp/stow.log 2>&1` in the Che Code terminal creates this missing symbolic link without error (`cat`ing /tmp/stow.log shows no errors for stow).
When looking at the GH Actions for the UDI, the stow command in the Dockerfile is not failing:
```
#57 [50/53] RUN stow . -t /home/user/ -d /home/tooling/ --no-folding
#57 DONE 6.7s
```
It's probably worth adding `-v 2` to the stow command in the Dockerfile to see whether any errors get logged when debugging this, though I doubt it.
My current guess is that some file ownership/permissions issue is happening with the related storage.conf files.
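A quick way to check that guess from a workspace terminal (both paths are taken from the observations above):

```bash
# Compare ownership and permissions of the tooling copy and the
# (possibly missing) user copy of storage.conf
ls -la /home/tooling/.config/containers/storage.conf \
       /home/user/.config/containers/storage.conf
```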
What's weird is that the git blame for the storage.conf lines in the Dockerfile shows they haven't been touched in months (for some lines, in years)...
I'm able to reproduce the issue with the empty sample workspace if I have the following in the CheCluster:
```yaml
spec:
  devEnvironments:
    storage:
      pvcStrategy: per-workspace
    persistUserHome:
      enabled: true
```
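For convenience, the same settings can be applied to a running cluster with a merge patch; the CR name `eclipse-che` and namespace `eclipse-che` below are assumptions, so adjust them to your installation:

```bash
# Hypothetical example: enable persistUserHome on an existing CheCluster CR
oc patch checluster eclipse-che -n eclipse-che --type merge -p '
spec:
  devEnvironments:
    storage:
      pvcStrategy: per-workspace
    persistUserHome:
      enabled: true
'
```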
From my testing, this issue happens for workspaces with Che Code versions that include this change: https://github.com/che-incubator/che-code/pull/221/files. I cannot reproduce the issue with Che Code images built before that PR.
Also, not only is storage.conf not linked; other files such as `.kubectl_aliases -> ../tooling/.kubectl_aliases` are not linked either:
```
~ $ ls -la ~
total 12
drwxrwsr-x. 5 root user 123 Apr 10 19:06 .
drwxrwxr-x. 1 root root 21 Apr 9 11:13 ..
-rw-r--r--. 1 user user 141 Apr 10 19:06 .bash_profile
-rw-r--r--. 1 user user 376 Apr 10 19:06 .bashrc
drwxr-sr-x. 3 user user 24 Apr 10 19:06 .config
drwxr-sr-x. 2 user user 20 Apr 10 19:06 .kube
drwx--S---. 3 user user 19 Apr 10 19:06 .local
-rw-r--r--. 1 user user 0 Apr 10 19:06 .stow_completed
-rw-r-----. 1 user user 532 Apr 10 19:06 .viminfo
```
Thank you for the valuable info, @dkwon17 :pray: David and I made some more findings regarding this bug:
As mentioned, persistUserHome needs to be enabled in the CheCluster CR. When this feature is enabled, stow is run from the UDI's entrypoint.
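To make the failure mode easier to follow, here is a simplified sketch (assumed, not the actual entrypoint code) of what the entrypoint does, using the `.stow_completed` marker file seen in the directory listing above:

```bash
# Simplified sketch of the entrypoint's stow step (assumption, not verbatim):
# run stow only once, using a marker file to remember completion
if [ ! -f /home/user/.stow_completed ]; then
  stow . -t /home/user/ -d /home/tooling/ --no-folding -v 2 > /tmp/stow.log 2>&1
  touch /home/user/.stow_completed
fi
```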
If you check the contents of /tmp/stow.log, you can see that the stow command in the entrypoint is currently failing:
```
LINK: .gitconfig => ../tooling/.gitconfig
Planning stow of package .... done
Processing tasks...
stow: ERROR: Could not create directory: .local (File exists)
```
Upon inspection, it seems that /home/user/.local/share/containers/storage/ is being created and populated at some point during workspace startup. If you run quay.io/devfile/universal-developer-image:latest as a standalone container in Docker, there's no /home/user/.local/share/ directory.
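This is easy to double-check outside of Che; the following simply runs the published image and lists the directory in question:

```bash
# Outside Che: share/ should not appear under /home/user/.local in the
# plain image (the ls may even fail if .local itself is absent)
docker run --rm quay.io/devfile/universal-developer-image:latest \
  ls -la /home/user/.local
```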
Because /home/user/.local/share/... is non-empty, stow aborts when run in the entrypoint, causing other important files not to be symbolically linked, such as .kubectl_aliases and /home/user/.config/containers/storage.conf.
As a temporary workaround for this bug, you can run `rm -rf .local/share/ && rm /home/user/.stow_completed && /entrypoint.sh` (spelled out below). Afterwards, running `podman info` should work as expected.
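The same workaround with absolute paths, one step per line:

```bash
# Temporary workaround, run from the workspace terminal:
rm -rf /home/user/.local/share/   # remove the prematurely populated directory
rm /home/user/.stow_completed     # clear the marker so the entrypoint re-runs stow
/entrypoint.sh                    # re-run the entrypoint so stow creates the links
# afterwards, `podman info` should work as expected
```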
I don't think the issue is coming from the UDI itself. My guess is that this is Che Code related: at some point during the workspace bootstrap process, /home/user/.local/share/... is being populated.
I also noticed that if you `rm -rf .local/share/` and then run `podman info` (or `podman build`, and probably other podman commands), /home/user/.local/share/containers/storage will be created. So it's possible that a recent change to Che Code is executing a podman command, which is causing this bug to occur.
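You can confirm that behavior directly in a workspace terminal:

```bash
# Observation: any podman invocation re-creates the local storage tree,
# even if the command itself still fails with the overlay error
rm -rf /home/user/.local/share/
podman info || true
ls -d /home/user/.local/share/containers/storage   # present again
```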
This issue should be resolved now that https://github.com/devfile/devworkspace-operator/pull/1251 is merged. Once the nightly build of the DevWorkspace Operator is live on the dogfooding instance, we should verify that this issue is no longer reproducible.
Now that https://github.com/devfile/devworkspace-operator/pull/1251 and https://github.com/devfile/developer-images/pull/173 are merged, this issue finally seems to be resolved. Great work @dkwon17!