Allow Singularity to run if no username is defined in /etc/passwd

Open rptaylor opened this issue 3 years ago • 12 comments

Version of Singularity:

3.7.2

Expected behavior

Singularity could run when there is no username.

Actual behavior

$ echo $HOME
/scratch/home
bash-4.2$ id
uid=10700 gid=10000 groups=10000
bash-4.2$ whoami
whoami: cannot find name for user ID 10700

/cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity --help
WARNING: Could not lookup the current user's information: user: unknown userid 10700
FATAL:   Couldn't determine user account information: user: unknown userid 10700

What OS/distro are you running

registry.hub.docker.com/atlasadc/atlas-grid-centos7 (CentOS 7.8 container image)

How did you install Singularity

Using /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity

I know this is probably the currently intended behaviour and not considered a bug, but I would like to make a feature request to improve the behaviour so that Singularity can be used in a more portable way and on more container platforms without running into issues, particularly in Kubernetes and Docker containers. In (Kubernetes) containers you can select any UID to run a process in your container. The UID is generally arbitrary and meaningless, and the username is even more arbitrary and meaningless, so it is often left out. The UID is just a number and you can generally pick anything between 1 and at least 65K. Some k8s clusters may have Pod Security Policies in place which limit the available UID range, but usually any sufficiently large number (e.g. > 0 so not root, or > 1000 so not a system UID) works.

Sometimes a user account is specified in a container image, e.g. when built from a Dockerfile. However, this embeds the user account into the image (in /etc/passwd) at container build time, which creates a portability problem at run time if the UID does not match the allowed range on a given cluster. The most compatible and container-native approach is to update application code to stop relying on usernames (e.g. if getpwuid() fails, just use the numeric UID/GID instead), so that arbitrary UIDs can be used without defined names. Then, when you compose your container application to run on Kubernetes, you are free to use whatever UID you want (or not specify one at all and just get a default UID).

For what I'm currently trying to do, to work around this I would have to take the image for the application I'm working with, start it up and do useradd with a random UID and username, save my modified copy and upload it to a container registry, and then make sure that my k8s pod YAML specifies the same UID that I added to the image. Especially if I want to use an image that is maintained and frequently updated by someone else, this adds a lot of extra overhead to the workflow if I want to use Singularity in my container, compared to just directly using the available image on Dockerhub or elsewhere.

What does Singularity need a username for? My container already has everything else it needs, including a writeable $HOME directory. If some descriptive string is needed can Singularity use something like "undefined_XXXX" where XXXX is the UID?

Thanks!

rptaylor avatar Apr 26 '21 23:04 rptaylor

Are user accounts in files, and if so, what is in your passwd and group files?

grep 10700 /etc/passwd
grep 10700 /etc/group

gmkurtzer avatar Apr 27 '21 03:04 gmkurtzer

Hi,

No, there is no user info embedded in the /etc/passwd or /etc/group files in the container image. Usually only the minimal set of system accounts that comes with the container image is defined there.

The situation with GIDs is the same: completely arbitrary, you just specify whatever primary and supplementary GIDs you want to have at run time in your pod spec and that is what will run in the pod. You can see your GIDs with the id command and groups also shows the GID but there is no group name.

bash-4.2$ id
uid=10000 gid=10000 groups=10000

bash-4.2$ groups
groups: cannot find name for group ID 10000
10000

Thanks!

rptaylor avatar Apr 27 '21 17:04 rptaylor

So if I'm understanding what you are trying to do...

You have K8s (or Docker), and you are spinning up a container using that, with an arbitrary UID/GID, and from within there, you wish to run Singularity? And just to confirm, there are no entries in passwd or group for the given UID/GID the process is running with?

If that is the case, I think the easiest path forward would be to add those pieces to the appropriate files where you are running Singularity. The less easy path forward is to find all of the places where this could fail in Singularity and provide an alternate code pathway to get the required information. I cringe at that option.

You've also made me quite curious. What kind of a use-case are you solving for by running Singularity in this way?

Greg

gmkurtzer avatar Apr 27 '21 18:04 gmkurtzer

Hi Greg,

Yes that is correct.

"If that is the case, I think the easiest path forward would be to add those pieces to the appropriate files where you are running Singularity."

By pieces do you mean adding user info to /etc/passwd and /etc/group? That is a privileged operation so it can't be done at runtime in the containers when invoking Singularity. It would require making modified container images with arbitrary user accounts injected at build time as I described, which adds overhead especially for container images that are frequently updated.

The use case is Kubernetes batch computing resources for HEP experiments. Unfortunately, HEP workloads are not completely and cleanly encapsulated in the sense that the payload is a container and you can just run the container. Rather, the payload is a bundle of scripts that does some preparation and makes some decisions about how to launch container(s). That is, the payload is not a container; it is a thing that needs to launch its own containers. (Aside from actual HEP jobs, another example is the hep-score benchmark.) This approach makes sense considering the history of HEP experiments running on batch systems, and it is the same reason that Singularity fits with that approach (allowing users to launch containers on HPCs). However, this clashes a bit with the containerization approach used by the wider tech world, where app = container. (This also has implications for how and where user ID management happens.) That is why we need Singularity containers inside Kubernetes pods, and I am trying to bridge the gap between these two approaches and eliminate some barriers. One of the advantages of Singularity (even compared to Podman in some aspects) is that it can be used to start a container quite easily in many scenarios with no privileges (being able to use it to make nested containers at all in Kubernetes is great), so IMHO it would be nice to expand on that advantage by increasing the number of cases where Singularity "just works".

I think/hope it would not actually be that cringe-inducing to fix. I might be able to help with an MR despite not knowing much golang. I will post more soon with a look at the code.

rptaylor avatar Apr 27 '21 21:04 rptaylor

Part 1: possible changes in Singularity

Currently the warnings/failures occur here: https://github.com/hpcng/singularity/blob/master/pkg/syfs/syfs.go#L48 This code is slightly over-ambitious in the sense that it does a full lookup of user info - but only actually needs the homedir, which does exist and is set via $HOME. For that matter if it only needs to find a location for a config dir it might be better off to use https://golang.org/pkg/os/#UserConfigDir instead of concatenating other paths together. In any case it appears to proceed using CWD instead (?) with only a warning.

https://github.com/hpcng/singularity/blob/master/cmd/internal/cli/singularity.go#L220 This is where the failure occurs. This getCurrentUser() function appears to be a suitable wrapper for user.Current, possibly an opportunity where error-handling code could be introduced in only one place, and all other parts of Singularity code could call getCurrentUser() instead of user.Current().
There appears to be only a small handful of other places that would then need to be changed to call getCurrentUser() instead:

$ grep "user.Current("  -r singularity/
singularity/internal/pkg/runtime/engine/singularity/prepare_linux.go:	pw, err := user.Current()
singularity/cmd/internal/cli/singularity.go:	usr, err := user.Current()
singularity/cmd/internal/cli/startvm_darwin.go:	usr, err := user.Current()
singularity/pkg/syfs/syfs.go:	user, err := user.Current()
singularity/pkg/syfs/syfs.go:	if cu, err := user.Current(); err == nil && u.Username == cu.Username {

This is what is expected from user.Current: https://golang.org/pkg/os/user/#Current

In any case the error handling code that could hypothetically be introduced into getCurrentUser() could call the following functions to retrieve the required info if user.Current() fails: https://golang.org/pkg/os/#UserHomeDir https://golang.org/pkg/os/#Getgid https://golang.org/pkg/os/#Getuid If the username is indeed needed, the code could look up $USER (which I could set to some random bogus value) and possibly the code could set a bogus value if it is not defined. Then all the info needed to construct a User struct would be available, acquired by this alternative method if user.Current() fails, and the getCurrentUser() could return the result.

rptaylor avatar Apr 27 '21 23:04 rptaylor

Side note: perhaps I am mistaken, but this appears to be checking the user's capabilities based on their GECOS display name (which could include spaces and other special characters etc.) as opposed to the username? https://github.com/hpcng/singularity/blob/master/internal/pkg/runtime/engine/singularity/prepare_linux.go#L226

rptaylor avatar Apr 27 '21 23:04 rptaylor

Part 2: it would be nice to find a solution in Go without having to change Singularity code, in particular if I could just export USER=whatever. However, in my current test environment I get the same result if I do that.

user.Current() https://github.com/golang/go/blob/master/src/os/user/lookup.go#L14

calls current() https://github.com/golang/go/blob/master/src/os/user/lookup_stubs.go#L22 which appears to get Username from os.Getenv("USER").

or if cgo: https://github.com/golang/go/blob/master/src/os/user/cgo_lookup_unix.go#L50 https://github.com/golang/go/blob/master/src/os/user/cgo_lookup_unix.go#L91 not sure of the details there but I think it does the equivalent of getpwuid_r, reading the passwd file.

Based on https://github.com/golang/go/issues/38599 it sounds like if Singularity is built with CGO_ENABLED=0 (not sure of the consequences of that) I could get away with the $USER workaround? But if that is the case, I would argue the same behaviour should result regardless of how Singularity is compiled: if defining $USER is an acceptable workaround in one case, wouldn't it also be acceptable to modify the code so that it is also possible when cgo is enabled?

rptaylor avatar Apr 28 '21 00:04 rptaylor

Similar to #5598 and #5757

rptaylor avatar Apr 28 '21 00:04 rptaylor

Hello,

This is a templated response that is being sent out to all open issues. We are working hard on 'rebuilding' the Singularity community, and a major task on the agenda is finding out what issues are still outstanding.

Please consider the following:

  1. Is this issue a duplicate, or has it been fixed/implemented since being added?
  2. Is the issue still relevant to the current state of Singularity's functionality?
  3. Would you like to continue discussing this issue or feature request?

Thanks, Carter

carterpeel avatar May 15 '21 16:05 carterpeel

Yes, still relevant.

rptaylor avatar May 17 '21 18:05 rptaylor

This issue has been automatically marked as stale because it has not had activity in over 60 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jul 16 '21 18:07 stale[bot]

Still interested in a way to do this.

rptaylor avatar Jul 16 '21 18:07 rptaylor

Moved to apptainer/apptainer#1066

DrDaveD avatar Feb 08 '23 19:02 DrDaveD