docker-stacks Revisit root permissions and entrypoint for user management

In start.sh, we support setting up the user in a variety of ways that currently require the container to be started as root. #552, #553, #559 introduce a mechanism to handle a subset of those cases when the user is in the root group, as is typical with openshift. #553 initially did so in an ENTRYPOINT instead of CMD in order to benefit even when CMD is overridden, e.g. directly to jupyterhub-singleuser in kubespawner. That logic now resides in start.sh with everything else (#559), but perhaps much of start.sh ought to move to an ENTRYPOINT instead of the CMD.

Things to consider:

how much can/should we accomplish with the root group instead of the root user? I'd love it if we could stop requiring folks to launch the image as the root user, since a slip-up there can be a major permission issue
how much of the permissions/user setup belongs in an ENTRYPOINT rather than start.sh? Pro of ENTRYPOINT is that overriding CMD preserves permissions/user-management logic. Con of ENTRYPOINT is that opting out of user-management behavior is a bit harder if desired. So if we move things to the entrypoint, we should make sure that default behavior is always safe and almost always desirable.

Current capabilities in start.sh that require root permissions:

update $HOME to /home/$NB_USER if username is set specifically
rename jovyan user to $NB_USER if set
set UID/GID of NB_USER
grant $NB_USER sudo permissions
chown $HOME (typically just the empty-directory volume case)

Feb 20 '18 12:02 minrk

I've stalled on creating the ReadTheDocs site, partially because I'm at the section of the doc where I have to document all the various options the images support. It's daunting.

So I was planning to open an issue this morning to discuss what options we currently support (e.g., NB_UID), why they were originally needed, and if they're still needed given the usage patterns that have emerged.

Then I saw this new ticket about root permissions (thanks @minrk !) and wonder if we should tackle the two topics together:

Can we shed any of the options we currently support?
How much do we really need to do in entrypoint / cmd that requires root?

Feb 20 '18 14:02 parente

I'm currently using https://github.com/GrahamDumpleton/openshift3-jupyter-stacks to be able to deploy on OpenShift. However, since the dockerfile there specifies changes to /home/$NB_USER, it results in an additional ~3GB layer to the base image, just because it's changing permissions.

I can imagine if this enhancement is resolved, then these images would run natively on OpenShift. The biggest challenge in doing this seems to be the requirement to run as root since some of the current capabilities require it. However, there's not much clarity as per why those features are required in the first place.

Can someone shed some light into why it's required to be able to run as a different user?

Apr 23 '18 11:04 aneagoe

@GrahamDumpleton made quite a few contributions to this repository lately, including documentation about using these images on OpenShift 3 (https://github.com/jupyter/docker-stacks/tree/master/examples/openshift). Perhaps the problem has been mitigated already?

Apr 23 '18 13:04 parente

The docker-stacks images don't have all the changes I would like to see, but based on reactions have got, I doubt that will ever be able to get the last change I want done that would allow the images to run on OpenShift out of the box without any special configuration. That one change I would like to still have made is having NB_GID be 0 and not that of users. So by default notebook user would run with gid of 0 and all directories and files it needs to write to being gid of 0 as well.

So have done as much as I can. They will run, but you need to configure OpenShift, the Kubernetes environment with multi user RBAC setup, or any other more secure Docker environment, to run the containers with supplemental group of 100, being that for users.

As pointed out, see the examples directory. For Source-to-Image builds there is also:

https://github.com/jupyter/docker-stacks/tree/master/examples/source-to-image

I am gradually pulling all information together about running Jupyter on OpenShift at:

https://www.jupyteronopenshift.org/

It is a self signed certificate on the site for now, but that will be fixed.

There is also lots of other repos and examples related to running JupyterHub, kernel gateways, nbviewer, native OpenShift S2I based notebooks etc., at:

https://github.com/jupyter-on-openshift

Apr 23 '18 21:04 GrahamDumpleton

That one change I would like to still have made is having NB_GID be 0 and not that of users. So by default notebook user would run with gid of 0 and all directories and files it needs to write to being gid of 0 as well.

Maybe my memory is faulty, but I remember some questions about whether using the root group could lead to security issues and some answers back that it looked OK to OpenShift folks. After that, I think the conversation trailed off and was lost when folks, myself included, suggested that changes be made one PR at a time.

I'm open to switching the default user gid to 0 as long as it doesn't regress other features of the images. I think @minrk is too from his comments at the start of this issue, albeit for the separate reason of trying to mitigate the need to launch containers as uid=0 for certain features.

Apr 24 '18 02:04 parente

If these images were only ever used as is there would be no issue changing. The problem is the effect on people who are creating derived images. You can’t assume that people when adding stuff in derived images are setting user/group based on NB_UID/NB_GID environment variables, if they are at least trying to match permissions. So it isn’t a security issue, just that derived images people are creating could be broken by changing the group to GID of 0.

Apr 24 '18 02:04 GrahamDumpleton

Sorry, I should have been clearer. By "regress other features" I meant "break without the ability to fix".

Documenting best practices for deriving images from the core definitions here, especially the base and minimal images, is fodder for the ReadTheDocs site slowly being worked in #517 and associated PRs. If we document examples of how to properly set permissions in derived images (e.g., fix-permissions script) and we tag the last image to use the gid=100 setting for users that aren't ready to make the switch quite yet, I think breaking backward compatibility is fair game.

Apr 24 '18 03:04 parente

@GrahamDumpleton thanks a lot for all your efforts with regards to getting jupyter on OpenShift, it was a huge help. It seems to be already better than I thought, I didn't know that just applying supplemental group 100 would work with existing upstream images. That makes it easier. Still, it would be ideal to run with gid=0 straight off.

Apr 24 '18 12:04 aneagoe

My main concern about defaulting to GID=0 is that it seems to be the wrong thing to do for the much more common case of people running these docker images outside of kubernetes/openshift-style locked-down environments. See this thread where running users as gid=0 with the current state of things makes it trivial to become the root user. Only in extra-secure contexts like openshift does it seem to be a safe and correct choice use run with gid=0. Since that's not the primary target case for these docker stacks, I'm not sure there's a good way to serve both.

I also want it to be clear that these stacks are far from the only way to install Jupyter, and we shouldn't be trying to make these stacks serve all use cases. I think the current level of configurability is a source of significant complications.

I suspect for cases like openshift with specific demands, using something like s2i to build images with Jupyter installed via conda and/or pip with specific choices for user/group permissions may well be the preferable, simpler solution to trying to make these stacks work optimally in all possible environments.

Apr 24 '18 13:04 minrk

Arrgggh. Totally forgot that issue was raised around use of su. With that PR closed and work moved to another, that probably didn't get discussed further as well as it should have at the time.

Apr 24 '18 22:04 GrahamDumpleton

Not sure yet what issues this may cause with existing functionality in the image which uses su and sudo, but you could restrict use of su to users in the wheel group.

RUN echo "auth           required        pam_wheel.so use_uid" >> /etc/pam.d/su

Result then is:

$ docker run -u 100000 --rm -it sutest
(app-root)id
uid=100000 gid=0(root) groups=0(root)
(app-root)cp /etc/passwd /tmp/passwd
(app-root)rootpass=`openssl passwd -1 root`
(app-root)cat /tmp/passwd | sed "s/root:x/root:${rootpass}/" > /etc/passwd
(app-root)
(app-root)su root
Password:
su: Error in service module
(app-root)id
uid=100000 gid=0(root) groups=0(root)

Apr 24 '18 23:04 GrahamDumpleton

You would though have to also rollback the change made to make /etc/group writable as if left that way someone could add themselves to group wheel anyway.

The addition of a corresponding entry to /etc/group for an assigned user ID not in /etc/passwd was not essential and I don't know of a specific case where lack of it would cause failure of Jupyter notebooks or other Python packages.

Apr 24 '18 23:04 GrahamDumpleton

As per comments on https://github.com/jupyter/docker-stacks/pull/653 it seems that what can do to lock down ability to run su is the following.

Remove group root write access to /etc/group that had previously been added to allow an entry to be added. This will prevent a random user ID without a /etc/passwd entry from being allowed to add them to extra groups, such as wheel.
Add a group wheel with gid of 11. This is free and would avoid problems with groups added for users which would be 100 or above. The group would have no members.
Enable pam_wheel.so in /etc/pam.d/su. With wheel group enabled, but empty, this means that the only user who can use su is the root user, being in group root is not enough.

End result is that when running as random user ID not in /etc/passwd, and so running with group root, doesn't matter that it can set a password for root in /etc/passwd as the user would be prohibited from running su as it isn't in the wheel group.

Relevant docs from pam_wheel manual page which explain role of wheel group is:

The pam_wheel PAM module is used to enforce the so-called wheel group. By default it permits root access to the system if the applicant user is a member of the wheel group. If no group with this name exist, the module is using the group with the group-ID 0.

Even with this, still would be a good idea to document that if running with random UID not in /etc/passwd file that drop capabilities for setuid using --cap-drop option to docker run, or in Kubernetes if that is being used and not dropped by default.

These changes do not appear to affect ability to sudo when running container as root and setting GRANT_SUDO.

Jun 06 '18 07:06 GrahamDumpleton

I have reworked my withdrawn PR and create a new one at https://github.com/jupyter/docker-stacks/pull/654 to lock down in what circumstances su can be executed.

Jun 07 '18 00:06 GrahamDumpleton

Despite #653, it still seems like it would be trivial to gain root access if the container is run with group root, e.g. --user 1234:0. Is that intended? And if not, what is the security-issue reporting process/email, and can we add that to the docs if not already there?

Mar 20 '19 23:03 javabrett

Is there a reason we can't just grant sudo root to jovyan for the purpose of running a single (hardened) script launched from start.sh, with that script (maybe call it run-as-root.sh) containing all the stuff requiring elevation at start-up? Could even delete that script after running it.

Feels like that would allow the container to always launch as non-root (user or group) but retain all the user-remapping, home-dir and chown goodies.

Mar 20 '19 23:03 javabrett

If someone already has the ability to override the startup command and supply --user 1234:0, then they could just as easily run the container as user root by doing --user 0. So you have bigger problems if they have the ability to run containers as root user itself anyway. Note that running as user root is very different to running as non root user and group root.

As you say it is 'trivial to gain root access', can you explain by what mechanism having group root with non root user will allow that?

Running as group root doesn't automatically grant a user any special privileges. The issue is usually around what files they can write to as group root and inadvertent effects that could have, such as was the case with making /etc/group writable and which was addressed by https://github.com/jupyter/docker-stacks/pull/654.

The running as group root as fallback would really only be relied upon in container platforms which have extra security mechanisms that means users are forced to run as arbitrary user IDs. In these same platforms, they would also drop linux capabilities for kernel operations required by su and sudo as well, so they would be blocked.

So the group root case only comes up for certain deployment methods where there are usually extra levels of protection which would guard against becoming root user in some unforeseen way. No one should be choosing to run with arbitrary user ID, and thus group root, without such extra protections also enabled.

So can you explain what you believe makes it trivial to become the root user, when running with group root but as non root user.

In the mean time, it could be worthwhile to document as general precaution, that in all cases where not relying on sudo or su capabilities, that Linux capabilities for being able to do su, and perhaps other kernel operations as well, be dropped, and provide an example of how to do that when running docker run. I am presuming this wasn't done since last time suggested documenting this.

Mar 21 '19 00:03 GrahamDumpleton

@GrahamDumpleton let's discuss via email or Gitter if you like (or something else, you choose), since it is a security question.

Mar 21 '19 01:03 javabrett

Or BlueJeans or phone since you appear to live in the same city. :-)

For email you can get me at: [email protected]

Mar 21 '19 02:03 GrahamDumpleton

Re: reporting vulnerabilities, I've started a thread on Discourse about how we might better indicate the preferred procedure https://discourse.jupyter.org/t/responsible-vulnerability-reporting/655.

Mar 31 '19 21:03 parente