Singularity should allow any user to build from a definition file.
Continuation of apptainer/singularity#5941.
Cc: @preney
While important but separate from this issue, once this issue is (hopefully) addressed, the next issue to examine and resolve is the default/auto-bind-mounted hard-coded paths, e.g., see apptainer/apptainer#205.
@preney If the decision is made to allow unprivileged users to build a definition file, how are we going to implement it?
Would it be the default or be opted-in with a flag? If we use a flag, how should singularity behave when root privileges are detected? How do we ensure reproducibility?
I feel that the default should be to allow users to build with a definition file.
Syntax could be added to the definition file to assert it requires root privileges and so running such without those perms may or may not work properly or be reproducible. It is also possible to have the "opposite" syntax, i.e., syntax that asserts the definition file can be run without root permissions. Given there are existing definition files the latter might be preferred. (Either way should work.)
That said, it might not be unreasonable to have "version" syntax in the definition file to allow dealing with past semantics, permitting things to evolve forwards, e.g., if breaking some backwards compatibility. The lack of version syntax would imply the original rules and the version syntax would imply whatever that version requires.
I am thinking a version syntax is probably a good idea, since the existing definition file syntax IMHO isn't really all that reproducible or "secure": it depends heavily on the existing shell and the commands available on the system being used. IMHO it should evolve to be shell and command independent and/or formally assert the required shell/commands so that they can be checked prior to running the definition file. Getting to such a point would likely take a number of "syntax versions", as that would not happen overnight.
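To make the "assert required commands and check them first" idea concrete, here is a minimal Python sketch. The `#requires:` directive is purely hypothetical (no such syntax exists in definition files today), and the function names are made up for illustration; the only real API used is `shutil.which` from the standard library.

```python
import shutil

# Hypothetical directive, invented for this sketch only.
# It is NOT part of any existing definition-file syntax.
DIRECTIVE = "#requires:"


def required_commands(def_text):
    """Collect command names listed on hypothetical '#requires:' lines."""
    names = []
    for line in def_text.splitlines():
        line = line.strip()
        if line.startswith(DIRECTIVE):
            names.extend(line[len(DIRECTIVE):].split())
    return names


def missing_commands(names):
    """Return the commands that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]
```

A build tool could call `missing_commands(required_commands(text))` before running any section and refuse to start if the list is not empty.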
Here is a working prototype I made with some Geteuid() != 0 checks removed:
https://github.com/ShamrockLee/apptainer/tree/noroot
@ShamrockLee that's encouraging. Could you please create a PR out of it so it's easy to see the code differences? Mark it as a draft pull request so it won't be considered yet for merging if you're not ready for that.
Thank you.
I have opened the PR. Could you help me approve the CI?
@DrDaveD A lot has been happening with getting version 1.0 out no doubt. Has the PR been tested yet?
I believe @ShamrockLee is still working on it
The PR can now build Singularity images without root privileges when passing the --unprivileged flag to apptainer build.
However, apptainer build would still mount some admin-controlled files such as /etc/resolv.d, which causes problems in some restricted build environments (e.g. during the build process of a Nix package), and users should be able to opt out of these mounts.
@cclerget suggests that we can implement a build-time version of --no-mount to opt out of specific default mounts, but I haven't figured out how it is implemented for the run time and where the default mounts are done.
Could anyone offer me some clues about where to start?
I think this would likely become unnecessary if #447 is implemented.
> I think this would likely become unnecessary if #447 is implemented.
I'll see if it works under the standard build environment of Nix without building inside a VM. I have a feeling that the unshare part might fail.
> > I think this would likely become unnecessary if #447 is implemented.
> I'll see if it works under the standard build environment of Nix without building inside a VM. I have a feeling that the unshare part might fail.
Try it. It's a standard feature on modern Linux kernels. If the unshare command is missing that's OK, the apptainer implementation will directly use the system calls.
@DrDaveD The unshare and fakeroot work like magic on x86_64-linux! Need to test it on aarch64-linux also.
However, build-image-time --no-mount is still required to make it work inside the Nixpkgs build environment and other restricted environments where users don't have access to or cannot change the content of /etc and /var.
By the way, can someone build Apptainer images when the unprivileged Linux namespace functionality is not available? Some distros/platforms disable it by default, which will cause this approach to fail.
@ShamrockLee Now that #475 is merged, please try building from the apptainer main branch and say whether or not it solves your problem.
Oh I see I didn't answer your last comment.
> However, build-image-time --no-mount is still required to make it work inside the Nixpkgs build environment and other restricted environments where users don't have access to or cannot change the content of /etc and /var.
That shouldn't be a problem during an apptainer build because that operates on the container where the user does have full access.
> By the way, can someone build Apptainer images when the unprivileged Linux namespace functionality is not available? Some distros/platforms disable it by default, which will cause this approach to fail.
No, as implemented in #475 unprivileged user namespaces are required. I think it may be possible to redo it to use only fakeroot, but I'm not yet convinced it is worth doing because so many other things also require unprivileged user namespaces. In fact, Apptainer 1.1.0 is planned to not enable setuid by default, and people who install from pre-built packages will need to install an additional package to get it. We think that system administrators should enable unprivileged user namespaces when they are not enabled by default. We do recommend that, if they can, they then disable network namespaces, because all the recent CVEs related to unprivileged namespaces have required the combination of unprivileged user namespaces and network namespaces.
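For anyone who wants a quick check of whether unprivileged user namespaces look enabled on a host, here is a rough Python sketch. It is only a heuristic and not how Apptainer itself decides: it reads the kernel's /proc/sys/user/max_user_namespaces limit and, where present, the distro-specific /proc/sys/kernel/unprivileged_userns_clone switch found on some Debian/Ubuntu kernels. The function names are illustrative, and other restrictions (for example security modules) are not considered.

```python
from pathlib import Path

MAX_USERNS = "/proc/sys/user/max_user_namespaces"
# Distro-specific knob; absent on many kernels.
USERNS_CLONE = "/proc/sys/kernel/unprivileged_userns_clone"


def read_int(path):
    """Return the integer stored in a sysctl file, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def userns_likely_enabled(max_userns, userns_clone):
    """Heuristic: False if either knob is 0, True if the limit is positive,
    None when the limit could not be read (unknown)."""
    if max_userns == 0 or userns_clone == 0:
        return False
    if max_userns is None:
        return None
    return True


if __name__ == "__main__":
    print(userns_likely_enabled(read_int(MAX_USERNS), read_int(USERNS_CLONE)))
```

A result of False suggests asking the system administrator to enable unprivileged user namespaces, as recommended above; None means this sketch could not tell.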
> I think it may be possible to redo it to use only fakeroot
I just spent a little time trying to implement that but ran into a roadblock. It's essentially the same problem that @cclerget discussed in his comment on your PR. The apptainer build command is implemented using a nested call to apptainer for the %post section (and %test section) in order to set up the container environment, and it uses a custom configuration file as it does that. Using a custom configuration is of course a privileged operation because it would be a severe security hole to allow that to be done for a setuid installation. So that's only allowed with unprivileged user namespaces or when running as the root user.
It looks like what it is doing is setting default config options except for disabling mounts of home, devpts, resolv.conf, and the list of bind paths. I see --no-mount options for the first two but not the latter two. As Cedric said, additional --no-mount options would need to be added for those. He also talked about disabling the mount of passwd & group, and I don't yet see where that's currently happening. Then as he said there would need to be additional error checks for other non-default config settings that could interfere.
So it's a pretty big can of worms. Cedric, what do you think about instead having a single privileged option that uses the exact configuration needed for builds?
I went ahead and did an implementation in #481 using that idea, adding a hidden --build-config option which allows the nested apptainer command to use the build-specific configuration even when running privileged. I found I also had to imply the --fix-perms option when no unprivileged user namespaces were available, because even using the fakeroot command did not allow writing to directories that are unwritable by owner, where unshare -r does allow that.
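To illustrate the kind of permission fix-up that makes such directories writable again, here is a small Python sketch. It is not Apptainer's --fix-perms implementation, only an assumed approximation of the idea: walk a tree and make sure the owner has read and write access everywhere, plus execute on directories and on files that were already executable. The function names are invented for this example.

```python
import os
import stat


def add_owner_rwx(path):
    """Grant the owner rw (and x where appropriate) on one path."""
    st = os.lstat(path)
    if stat.S_ISLNK(st.st_mode):
        return  # os.chmod would follow the link, so leave symlinks alone
    extra = stat.S_IRUSR | stat.S_IWUSR
    any_exec = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    if stat.S_ISDIR(st.st_mode) or (st.st_mode & any_exec):
        extra |= stat.S_IXUSR
    os.chmod(path, stat.S_IMODE(st.st_mode) | extra)


def fix_perms(root):
    """Apply add_owner_rwx to root and everything below it."""
    add_owner_rwx(root)
    # Top-down walk: each directory is fixed before os.walk descends into it.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            add_owner_rwx(os.path.join(dirpath, name))
```

Calling `fix_perms` on an unpacked root filesystem owned by the current user would let later build steps write into directories that were originally shipped without owner write permission.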
@DrDaveD It seems that the build still relies on <localstatedir>/apptainer/mnt/session to work. Is there a way to specify that directory at run time?
There is not, but it is only used as a bind mount point inside of containers so it isn't expected to be a problem. What kind of problem does it cause?
As a Nix user and Nixpkgs contributor, I hope to generate Apptainer images just like normal Nix packages. We actually have a singularity-tool in Nixpkgs to do so. The build currently proceeds inside a QEMU VM because it needs root privileges and non-local directories. As Apptainer can now be run by unprivileged users, I hope to build an image without the VM.
The original reason I created this issue with Singularity which has been carried forward to Apptainer was to be able to create Gentoo and/or Nix containers without root using a definition file as such can be done completely without root access. This allows/enables users to roll their own custom images without root, e.g., in HPC environments. (Unfortunately the mere use of definition files in Singularity required root access hence the original reason for this issue.)
I'm sorry I forgot about this issue. I believe it was addressed in apptainer-1.1.0, in #475 and following. Closing this issue. Reopen if you think something is still lacking.
@DrDaveD The last obstacle on the way to a fully unprivileged image-building workflow is the dependence on <localstatedir>/apptainer/mnt/session. Users cannot set it up without root privileges, and Nix build expressions are not allowed to access those top-level directories.
> I went ahead and did an implementation in #481 using that idea, adding a hidden --build-config option which allows the nested apptainer command to use the build-specific configuration even when running privileged.
Sorry for bothering, but apptainer still keeps failing when /etc/resolv.conf isn't present, even when using --build-config and --config apptainer.conf with the line config resolv_conf = no.
FATAL: While performing build: failed to read /etc/resolv.conf: open /etc/resolv.conf: no such file or directory
Those sound like two new issues not directly related to the main topic of this issue. Please create new github issues with full info on how to reproduce, etc.
Edit - to reconsider this we'd need to have evidence of a number of builds that explicitly are possible without root privilege, and that are being performed regularly by multiple users.
Anything built with conda
Bootstrap: docker
From: continuumio/miniconda3
%post
    conda install -c bioconda your-package-here
In fact, this is by far the most common build scenario I use, and it does not require root. Yet it's not possible because
$ singularity build --fakeroot mycontainer.sif mycontainer.def
FATAL: could not use fakeroot: no mapping entry found in /etc/subuid for ec2-user
Sorry @stevekm. It sounds like user namespaces are not fully implemented or are improperly configured on your system. Please see the following.
https://apptainer.org/docs/user/latest/fakeroot.html#fakeroot
https://apptainer.org/docs/admin/1.2/user_namespace.html
I am just trying to build a Singularity container (which installs packages with conda) from a .def definition file on an AWS EC2 instance. I would think this would be a pretty simple situation to support. It's not clear to me from those docs what the solution is for this. Thanks.
Based on these docs you could try this:
sudo apptainer config fakeroot --add $USER
This will try to intelligently add the appropriate entry into /etc/subuid for your user. But it depends also on your host OS in your AWS instance and whether there is User Namespace support.
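To see whether a mapping entry already exists before or after running that command, a small Python sketch like the following could be used. It assumes the usual name:start:count line format of /etc/subuid (the same format is used by /etc/subgid), where the first field may be a login name or a numeric UID. The function name is made up for this example, and this is not how Apptainer itself performs the lookup.

```python
def find_subid_ranges(text, user, uid=None):
    """Return (start, count) ranges for `user` from subuid-style text.

    Each line is expected to look like 'name:start:count'; the first field
    may also be a numeric UID. Malformed lines are skipped.
    """
    ranges = []
    for line in text.splitlines():
        parts = line.strip().split(":")
        if len(parts) != 3:
            continue
        name, start, count = parts
        if name == user or (uid is not None and name == str(uid)):
            try:
                ranges.append((int(start), int(count)))
            except ValueError:
                continue
    return ranges


if __name__ == "__main__":
    import getpass
    import os

    try:
        with open("/etc/subuid") as handle:
            content = handle.read()
    except OSError:
        content = ""
    print(find_subid_ranges(content, getpass.getuser(), os.getuid()))
```

An empty list for the current user corresponds to the "no mapping entry found in /etc/subuid" error shown earlier in the thread.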