Possibility to change SINGULARITY_CACHEDIR in config file
Version of Singularity:
2.3.1-dist
Expected behavior
We tested singularity on our cluster and for us it would be a good feature to be able to change the cache directory in the singularity configuration file. This would avoid unwanted files in root's home.
Actual behavior
We bootstrapped some Singularity container from Docker images and files were cached at /root/.singularity/docker.
This is similar to #791. Since this involves a bit of C, I'd like to take a first shot! For learning purposes :)
@vsoch, what C changes do you feel are necessary?
Lol didn't we just talk about this all morning?
We would need a function (which I started writing in the PR above) to parse and return a value from the config. I mentioned C because it has been the convention to use it for registry and config, but we could just as easily have a Python function, or other means. We need to read a value from a text file and put it into an environment variable in a shell script... you know what they say about skinning cats!
To be very clear:
- admin sets option in the config for a default cache
- when there is any action with shub/docker, in the bash section, we check for this value (using method X)
- given a default cache is set by the admin, set it
- but still check for a cache defined by an environment variable, which would be a specification by the user to override the admin
the final cache directory wins!
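The precedence described in the list above can be sketched roughly as follows. This is only an illustration, not the implementation from the PR: the config key name `cache dir` and the helper names are hypothetical, and the fallback to `$HOME/.singularity` is inferred from the `/root/.singularity/docker` path in the original report.

```python
import os

# Hypothetical key name: the thread does not say what the option would be called.
CONFIG_KEY = "cache dir"


def read_config_value(config_text, key):
    """Return the value for `key` from `key = value` lines, or None if absent."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def resolve_cache_dir(config_text, environ, home):
    """Pick the cache directory: user environment, then admin config, then default."""
    user_choice = environ.get("SINGULARITY_CACHEDIR")
    if user_choice:
        return user_choice  # the user's setting overrides the admin default
    admin_default = read_config_value(config_text, CONFIG_KEY)
    if admin_default:
        return admin_default
    return os.path.join(home, ".singularity")
```

Calling `resolve_cache_dir(conf_text, os.environ, os.path.expanduser("~"))` would give the directory to export before any shub/docker action. Whether the user is allowed to override the admin value at all is a policy decision this sketch does not model.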
@vsoch
I wonder if this could just be done just with the environment variable?
If the admin sets SINGULARITY_CACHEDIR in either /etc/environment, /etc/profile.d/singularity or some other location, that would have a system-wide effect.
You could then change the cache directory by overwriting the value yourself in a script, your home config, etc.
Would this work? It's possible I'm overlooking a use case.
@jscook2345 this would work, but the idea is that for some cluster installations, in the same way you can configure other temporary directories to be standard for your resource, the same is needed for the cache directory. The user could then override this (or not) through environment variables, depending on whether the admin allows it. Ideally, we allow the admin to set this optimally for the user and thus make things much easier for the user (meaning they don't need to know about or set an environment variable).
This ... wouldn't be that great. Are you basically doing something like:
mkdir -m 0777 /tmp/sing_cache
Then setting SINGULARITY_CACHEDIR=/tmp/sing_cache ?
Every user is going to have access to the files of every container other users have run (i.e. the Docker cache), whether or not that specific user has access to the specific Docker location.
You could have the admin do more work, so you end up with a location like: /tmp/sing_cache/$USER ... but then that sort of defeats the purpose of the cache if it's local to only one machine.
HPC is a general enough use case that we cannot say that one specific use case is the only one intended or desired. The general idea that the admin should be able to customize the cache directory at the configuration level seems like a reasonable thing to ask for, especially given the other locations that are customizable.
But the difference is that CACHE is a per user thing, not a system thing.
Why is that? I don't see why, given that we are pulling layers that are shared across images (e.g. docker) it wouldn't actually be more efficient for a cluster to have some setup with a shared cache.
For the reason I gave. A shared cache means that every user has access to the cache files, even if they don't have access to the source. They can access files they don't normally have access to.
Please see the original post, where the "user" is describing an admin role that doesn't want the files going into /root. I still disagree that there is something special about the image layer cache when the layers are coming from a registry, and for singularity images, they are read only.
You're missing the point ... Here, a quick hack. I made /tmp/sing_cache/$USER the cache directory.
# mkdir -m 1777 /tmp/sing_cache
So ... the general default umask is 022 (or worse, 002). That gives you:
$ ls -l /tmp/sing_cache/
total 0
drwxr-xr-x 1 jason jason 20 Oct 23 18:22 jason/
Notice the "other" permission. That means that anyone can read it. So anyone can go into my user cache and poke at the files that are in there.
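The permission arithmetic behind that listing can be checked with a few lines. This is a minimal sketch using only the standard library; the helper names are made up for illustration, and 077 is included only as a contrasting umask that would keep the directory private.

```python
import stat


def mode_after_umask(requested, umask):
    """Permission bits a new file or directory ends up with under `umask`."""
    return requested & ~umask


def others_can_list_and_enter(mode):
    """True if users outside the owner and group can list and traverse the directory."""
    return bool(mode & stat.S_IROTH) and bool(mode & stat.S_IXOTH)


for umask in (0o022, 0o002, 0o077):
    mode = mode_after_umask(0o777, umask)
    # stat.filemode renders the bits the way `ls -l` does
    print(oct(umask), stat.filemode(stat.S_IFDIR | mode), others_can_list_and_enter(mode))
```

With a umask of 022 a directory requested as 0777 comes out as 0755, which is the `drwxr-xr-x` shown above: the "other" read and execute bits are still set.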
describing an admin role that doesn't want the files going into /root.
And again... that's a single user
and for singularity images, they are read only.
The images, when run, are read-only ... but the SIF file cache could be read by anyone. Is everyone on a system supposed to have access to every container that every other user has run? The way you want this, anyone would be able to read the cache files.
If a user doesn't want the cache in their home, then they can move its location outside the default. I thought there was a way to disable the cache... but maybe I just imagined that option existing.
An admin can set this if they want. I just think it's an awful idea, and can get you into compliance issues with you sharing the cache between every user, whether they have permission to the contents of those files or not.
But then if you are saying "it's a bad idea to have a shared cache...", isn't it also a bad idea that when an admin (or multiple admins) issue builds, they are using a shared cache at /root/.singularity? I think @fbartusch needs to comment about the issue, because you are addressing a different use case than his original issue.
using a shared cache at /root/.singularity
It can't be a shared cache ... I'm betting it's because of things like:
sudo singularity build [...]
That's not a shared cache, but because sudo is being used. It's still the cache for just the root user.
@fbartusch It looks like you can do this with an environment variable, but as @vsoch discusses maybe it would be useful to have some settings in the config. Could you describe your use case in a bit more detail? Thanks!
@jscook2345 Just saw this discussion. I'm a bit surprised because the issue is rather old ... The intention of this issue back in the day was that we tested Singularity on our cluster, built some images, tested things. Then we saw that there were caches in root's home, and we didn't want that to happen in root's home.
To this day we have not set SINGULARITY_CACHEDIR, so the caches are created in the user's home directory. Our homes are regularly backed up, so this is not really optimal for us, but we have lived with it. As not many users use Singularity at the moment, this is not a big problem for us.
If a user doesn't want the cache in their home, then they can move its location outside the default. I thought there was a way to disable the cache... but maybe I just imagined that option existing.
I just looked into the documentation, there is an environment variable for this: SINGULARITY_DISABLE_CACHE
If more and more users use Singularity (I hope this will happen) we will have a problem with the backup of the caches. Either we disable the caching, or we set the cache directory to a location other than HOME.
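The two options could be expressed as an environment prepared for whatever launches Singularity. This is a sketch under assumptions: only the variable names SINGULARITY_CACHEDIR and SINGULARITY_DISABLE_CACHE come from this thread; the helper name is hypothetical, and the value "True" is an assumption that should be checked against the documentation for the installed version.

```python
def singularity_env(base_env, cache_dir=None, disable_cache=False):
    """Return a copy of base_env adjusted for one of the two options."""
    env = dict(base_env)  # never mutate the caller's environment
    if disable_cache:
        # Assumption: the variable acts as a true/false flag; verify the
        # accepted value for your Singularity version.
        env["SINGULARITY_DISABLE_CACHE"] = "True"
        env.pop("SINGULARITY_CACHEDIR", None)
    elif cache_dir is not None:
        # Point the cache at a location outside the backed-up home directory.
        env["SINGULARITY_CACHEDIR"] = cache_dir
    return env
```

The result can be passed as the `env` argument of `subprocess.run` when invoking the singularity command, for example with a cache path on scratch storage (the actual path is site-specific).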
@vsoch
You're missing the point ... Here, a quick hack. I made /tmp/sing_cache/$USER the cache directory.
mkdir -m 1777 /tmp/sing_cache
So ... the general default umask is 022 (or worse, 002). That gives you:
$ ls -l /tmp/sing_cache/
total 0
drwxr-xr-x 1 jason jason 20 Oct 23 18:22 jason/
Notice the "other" permission. That means that anyone can read it. So anyone can go into my user cache and poke at the files that are in there.
I think this would be the best solution, if Singularity creates the cache directory for the user automatically with the strictest permissions. This means just the USER can access/read/write his 'own' cache, other users have no permissions on other caches.
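Creating the per-user directory with owner-only permissions might look like the sketch below. It is not what Singularity does internally; the function name is hypothetical, and it assumes the code runs as the user who will own the cache.

```python
import os
import stat
import tempfile


def ensure_private_cache(base_dir, user):
    """Create base_dir/user accessible only by its owner and return the path."""
    path = os.path.join(base_dir, user)
    os.makedirs(path, mode=0o700, exist_ok=True)
    # chmod explicitly: makedirs leaves the mode of an existing directory
    # untouched, and the mode it applies to a new one is filtered by the umask.
    os.chmod(path, 0o700)
    return path


# Usage against a throwaway base directory instead of a real /tmp/sing_cache
with tempfile.TemporaryDirectory() as base:
    cache = ensure_private_cache(base, "jason")
    print(stat.filemode(os.stat(cache).st_mode))
```

With mode 0700 the group and other permission bits are all cleared, so other users cannot list or enter the directory regardless of their umask.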
The problem for us in the future will be that our home directories are backed up. If more users start using Singularity, we will either disable the cache or use the above solution. Please note that the problem I mention now is not the same one I opened this issue for a year ago.
Hello,
This is a templated response that is being sent out to all open issues. We are working hard on 'rebuilding' the Singularity community, and a major task on the agenda is finding out what issues are still outstanding.
Please consider the following:
- Is this issue a duplicate, or has it been fixed/implemented since being added?
- Is the issue still relevant to the current state of Singularity's functionality?
- Would you like to continue discussing this issue or feature request?
Thanks, Carter
If this truly isn’t done (I’m not sure) it’s relatively low hanging fruit and would be useful to have for an admin setting up an install.
This issue has been automatically marked as stale because it has not had activity in over 60 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
Don't close stalebot, the contributors to this issue have responded and it's the maintainers that have not.
@fbartusch We're looking into the issue carefully and will soon bring it to the community to discuss better ways to address it. Thank you for keeping up the interest in the subject.
Old singularity repo issues have been ported to the new apptainer repo: https://github.com/apptainer/apptainer/issues/625