Singularity should allow any user to build from a definition file.

Open DrDaveD opened this issue 2 years ago • 22 comments

Continuation of apptainer/singularity#5941.

DrDaveD avatar Feb 04 '22 18:02 DrDaveD

Cc: @preney

ShamrockLee avatar Feb 05 '22 10:02 ShamrockLee

This is important but separate from this issue: once this issue is (hopefully) addressed, the next thing to examine and resolve is the default/auto-bind-mounted hard-coded paths, e.g., see apptainer/apptainer#205.

preney avatar Feb 05 '22 20:02 preney

@preney If the decision is made to allow unprivileged users to build from a definition file, how are we going to implement it?

Would it be the default or be opted-in with a flag? If we use a flag, how should singularity behave when root privileges are detected? How do we ensure reproducibility?

ShamrockLee avatar Feb 06 '22 09:02 ShamrockLee

I feel that the default should be to allow users to build with a definition file.

Syntax could be added to the definition file to assert it requires root privileges and so running such without those perms may or may not work properly or be reproducible. It is also possible to have the "opposite" syntax, i.e., syntax that asserts the definition file can be run without root permissions. Given there are existing definition files the latter might be preferred. (Either way should work.)

That said, it might not be unreasonable to have "version" syntax in the definition file to allow dealing with past semantics, permitting things to evolve forwards, e.g., if breaking some backwards compatibility. The lack of version syntax would imply the original rules and the version syntax would imply whatever that version requires.

I think a version syntax is probably a good idea, since the existing definition file syntax IMHO isn't really all that reproducible or "secure": it depends heavily on the existing shell and the commands available on the system being used. IMHO it should evolve to be shell and command independent and/or to formally assert the required shell/commands, so that they can be checked before running the definition file. Getting to that point would likely take a number of "syntax versions", as it would not happen overnight.

preney avatar Feb 06 '22 20:02 preney
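
As a rough illustration of the proposal above, the sketch below reads a header key from the part of a definition file that precedes the first %section. The key names RequiresRoot and DefVersion are purely hypothetical: neither exists in Singularity or Apptainer, and def_header_value is an illustrative helper, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print the value of header KEY from definition file FILE.
# Only lines before the first %section are considered. Prints nothing if absent.
def_header_value() {
    key=$1
    file=$2
    awk -v k="$key" '
        /^%/ { exit }                      # the header ends at the first section
        {
            i = index($0, ":")             # split on the first colon only
            if (i == 0) next
            name = substr($0, 1, i - 1)
            value = substr($0, i + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (tolower(name) == tolower(k)) { print value; exit }
        }
    ' "$file"
}
```

Under this assumed syntax, a build front end could call `def_header_value RequiresRoot mycontainer.def` and, when the key is absent, fall back to the original rules, which is the backward-compatible behavior suggested above.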

Here is a working prototype I made with some Geteuid() != 0 checks removed:

https://github.com/ShamrockLee/apptainer/tree/noroot

ShamrockLee avatar Feb 11 '22 09:02 ShamrockLee

@ShamrockLee that's encouraging. Could you please create a PR out of it so it's easy to see the code differences? Mark it as a draft pull request so it won't be considered yet for merging if you're not ready for that.

DrDaveD avatar Feb 11 '22 19:02 DrDaveD

Thank you.

I have opened the PR. Could you help me approve the CI?

ShamrockLee avatar Feb 12 '22 13:02 ShamrockLee

@DrDaveD A lot has been happening with getting version 1.0 out no doubt. Has the PR been tested yet?

preney avatar Mar 17 '22 01:03 preney

I believe @ShamrockLee is still working on it

DrDaveD avatar Mar 17 '22 03:03 DrDaveD

The PR can now build Singularity images without root privileges when passing the --unprivileged flag to apptainer build.

However, apptainer build would still mount some admin-controlled files such as /etc/resolv.d, which causes problems in some restricted build environments (e.g. during the build process of a Nix package), so users should be able to opt out of those mounts.

@cclerget suggests that we can implement a build-time version of --no-mount to opt out of specific default mounts, but I haven't figured out how it is implemented at run time or where the default mounts are done.

Could anyone offer me some clues about where to start?

ShamrockLee avatar Mar 17 '22 04:03 ShamrockLee

I think this would likely become unnecessary if #447 is implemented.

DrDaveD avatar May 12 '22 19:05 DrDaveD

I think this would likely become unnecessary if #447 is implemented.

I'll see if it works under the standard build environment of Nix without building inside a VM. I have a feeling that the unshare part might fail.

ShamrockLee avatar May 12 '22 23:05 ShamrockLee

I think this would likely become unnecessary if #447 is implemented.

I'll see if it works under the standard build environment of Nix without building inside a VM. I have a feeling that the unshare part might fail.

Try it. It's a standard feature on modern Linux kernels. If the unshare command is missing that's OK, the apptainer implementation will directly use the system calls.

DrDaveD avatar May 13 '22 15:05 DrDaveD

@DrDaveD The unshare and fakeroot combination works like magic on x86_64-linux! I still need to test it on aarch64-linux.

However, build-image-time --no-mount is still required to make it work inside the Nixpkgs build environment and other restricted environments where users don't have access to, or cannot change the content of, /etc and /var.

By the way, can someone build Apptainer images when the unprivileged Linux namespace functionality is not available? Some distros/platforms disable it by default, which will cause this approach to fail.

ShamrockLee avatar May 13 '22 17:05 ShamrockLee

@ShamrockLee Now that #475 is merged, please try building from the apptainer main branch and say whether or not it solves your problem.

DrDaveD avatar Jun 02 '22 14:06 DrDaveD

Oh I see I didn't answer your last comment.

However, build-image-time --no-mount is still required to make it work inside the Nixpkgs build environment and other restricted environments where users don't have access to, or cannot change the content of, /etc and /var.

That shouldn't be a problem during an apptainer build because that operates on the container where the user does have full access.

By the way, can someone build Apptainer images when the unprivileged Linux namespace functionality is not available? Some distros/platforms disable it by default, which will cause this approach to fail.

No, as implemented in #475 unprivileged user namespaces are required. I think it may be possible to redo it to use only fakeroot, but I'm not yet convinced it is worth doing because so many other things also require unprivileged user namespaces. In fact, Apptainer 1.1.0 is planned to not enable setuid by default, and people who install from pre-built packages will need to install an additional package to get it. We think that system administrators should enable unprivileged user namespaces when they are not enabled by default. We do recommend that, if they can, they then disable network namespaces, because all the recent CVEs related to unprivileged namespaces have required the combination of unprivileged user namespaces and network namespaces.

DrDaveD avatar Jun 02 '22 14:06 DrDaveD
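
Since unprivileged user namespaces are required here, it can help to probe for them before starting a build. A minimal sketch, assuming a Linux host: /proc/sys/user/max_user_namespaces is a standard kernel setting, /proc/sys/kernel/unprivileged_userns_clone exists only on some distros (Debian and Ubuntu kernels, for example), and unshare -r comes from util-linux. The function name and the USERNS_MAX_FILE / USERNS_CLONE_FILE overrides are hypothetical and exist only to make the sketch testable. A passing probe does not guarantee that apptainer build will succeed.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if unprivileged user namespaces appear usable.
userns_available() {
    # Kernel-wide limit on user namespaces; 0 disables them entirely.
    max_file=${USERNS_MAX_FILE:-/proc/sys/user/max_user_namespaces}
    if [ -r "$max_file" ] && [ "$(cat "$max_file")" -eq 0 ]; then
        return 1
    fi
    # Distro-specific switch; the file is simply absent on many systems.
    clone_file=${USERNS_CLONE_FILE:-/proc/sys/kernel/unprivileged_userns_clone}
    if [ -r "$clone_file" ] && [ "$(cat "$clone_file")" -eq 0 ]; then
        return 1
    fi
    # Most direct probe: actually try to create a user namespace.
    if command -v unshare >/dev/null 2>&1; then
        unshare -r true 2>/dev/null
        return $?
    fi
    # No unshare command to probe with; assume the sysctls told the truth.
    return 0
}
```

Calling `userns_available || echo "ask an administrator to enable unprivileged user namespaces"` before a build gives an early, readable failure instead of one deep inside the build.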

I think it may be possible to redo it to use only fakeroot

I just spent a little time trying to implement that but ran into a roadblock. It's essentially the same problem that @cclerget discussed in his comment on your PR. The apptainer build command is implemented using a nested call to apptainer for the %post section (and %test section) in order to set up the container environment, and it uses a custom configuration file as it does that. Using a custom configuration is of course a privileged operation because it would be a severe security hole to allow that to be done for a setuid installation. So that's only allowed with unprivileged user namespaces or when running as the root user.

It looks like what it is doing is setting default config options except for disabling mounts of home, devpts, resolv.conf, and the list of bind paths. I see --no-mount options for the first two but not the latter two. As Cedric said, additional --no-mount options would need to be added for those. He also talked about disabling the mount of passwd & group, and I don't yet see where that's currently happening. Then as he said there would need to be additional error checks for other non-default config settings that could interfere.

So it's a pretty big can of worms. Cedric, what do you think about instead having a single privileged option that uses the exact configuration needed for builds?

DrDaveD avatar Jun 02 '22 17:06 DrDaveD

I went ahead and did an implementation in #481 using that idea, adding a hidden --build-config option which allows the nested apptainer command to use the build-specific configuration even when running privileged. I found I also had to imply the --fix-perms option when no unprivileged user namespaces were available, because even using the fakeroot command did not allow writing to directories that are unwritable by owner, whereas unshare -r does allow that.

DrDaveD avatar Jun 02 '22 22:06 DrDaveD
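
The directories that trip up the fakeroot command in this way can be found from their mode bits alone. A minimal sketch, assuming an unpacked root filesystem on disk; owner_unwritable_dirs is a hypothetical helper and is not how Apptainer implements --fix-perms.

```shell
#!/bin/sh
# Hypothetical helper: list directories under ROOT whose owner-write bit is
# clear. "-perm -200" matches entries with owner-write set, so it is negated.
owner_unwritable_dirs() {
    find "$1" -type d ! -perm -200 -print
}
```

Running `owner_unwritable_dirs /path/to/rootfs` on an extracted image shows where a build that relies only on the fakeroot command would fail to write.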

@DrDaveD It seems that the build still relies on <localstatedir>/apptainer/mnt/session to work. Is there a way to specify that directory at run time?

ShamrockLee avatar Jun 23 '22 20:06 ShamrockLee

There is not, but it is only used as a bind mount point inside of containers so it isn't expected to be a problem. What kind of problem does it cause?

DrDaveD avatar Jun 23 '22 21:06 DrDaveD

As a Nix user and Nixpkgs contributor, I hope to generate Apptainer images just like normal Nix packages. We actually have a singularity-tool in Nixpkgs to do so. The build currently runs inside a QEMU VM because it needs root privileges and non-local directories. As Apptainer can now be run by an unprivileged user, I hope to build an image without the VM.

ShamrockLee avatar Jun 24 '22 05:06 ShamrockLee

The original reason I created this issue with Singularity, which has been carried forward to Apptainer, was to be able to create Gentoo and/or Nix containers without root using a definition file, since such builds can be done completely without root access. This enables users to roll their own custom images without root, e.g., in HPC environments. (Unfortunately, the mere use of definition files in Singularity required root access, hence the original reason for this issue.)

preney avatar Jun 24 '22 07:06 preney

I'm sorry I forgot about this issue. I believe it was addressed in apptainer-1.1.0, in #475 and following. Closing this issue. Reopen if you think something is still lacking.

DrDaveD avatar Mar 16 '23 17:03 DrDaveD

@DrDaveD The last obstacle on the way to a fully unprivileged image-building workflow is the dependence on <localstatedir>/apptainer/mnt/session. Users cannot set it up without root privileges, and Nix build expressions are not allowed to access those top-level directories.

ShamrockLee avatar Apr 03 '23 18:04 ShamrockLee

I went ahead and did an implementation in #481 using that idea, adding a hidden --build-config option which allows the nested apptainer command to use the build-specific configuration even when running privileged.

Sorry to bother you, but apptainer still keeps failing when /etc/resolv.conf isn't present, even when using --build-config and --config apptainer.conf with the line config resolv_conf = no.

FATAL:   While performing build: failed to read /etc/resolv.conf: open /etc/resolv.conf: no such file or directory

ShamrockLee avatar Apr 05 '23 23:04 ShamrockLee

Those sound like two new issues not directly related to the main topic of this issue. Please create new github issues with full info on how to reproduce, etc.

DrDaveD avatar Apr 10 '23 19:04 DrDaveD

Edit - to reconsider this we'd need to have evidence of a number of builds that explicitly are possible without root privilege, and that are being performed regularly by multiple users.

Anything built with conda

Bootstrap: docker
From: continuumio/miniconda3
%post
    conda install -y -c bioconda your-package-here

In fact, this is by far the most common build scenario I use, and it does not require root. Yet it's not possible because

$ singularity build --fakeroot mycontainer.sif mycontainer.def
FATAL:   could not use fakeroot: no mapping entry found in /etc/subuid for ec2-user

stevekm avatar Sep 12 '23 01:09 stevekm

Sorry @stevekm. It sounds like user namespaces are not fully implemented or are improperly configured on your system. Please see the following.

https://apptainer.org/docs/user/latest/fakeroot.html#fakeroot
https://apptainer.org/docs/admin/1.2/user_namespace.html

GodloveD avatar Sep 12 '23 14:09 GodloveD

I am just trying to build a Singularity container (which installs packages with conda) from a .def definition file on an AWS EC2 instance. I would think this would be a pretty simple situation to support. It's not clear to me from those docs what the solution is for this. Thanks.

stevekm avatar Sep 12 '23 15:09 stevekm

Based on these docs you could try this:

sudo apptainer config fakeroot --add $USER

This will try to intelligently add the appropriate entry into /etc/subuid for your user. But it depends also on your host OS in your AWS instance and whether there is User Namespace support.

GodloveD avatar Sep 12 '23 15:09 GodloveD
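
The "no mapping entry found in /etc/subuid" error quoted earlier can be checked for before attempting a --fakeroot build. A minimal sketch, assuming the name:start:count line format described in subuid(5); has_subid_entry is a hypothetical helper, its optional file argument exists only for testing, and entries keyed by numeric UID rather than user name are not handled.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE (default /etc/subuid) contains a
# mapping line for USER. Lines have the form "name:start:count".
has_subid_entry() {
    user=$1
    file=${2:-/etc/subuid}
    [ -r "$file" ] || return 1
    # Compare the whole first field so that "bob" does not match "bobby".
    awk -F: -v u="$user" '$1 == u { found = 1 } END { exit !found }' "$file"
}

# Usage: report on the current user.
if has_subid_entry "$(id -un)"; then
    echo "subuid entry found"
else
    echo "no subuid entry; an administrator may need to add one"
fi
```

A matching check against /etc/subgid (same format) would be `has_subid_entry "$(id -un)" /etc/subgid`.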