zero-to-jupyterhub-k8s
Something along the way goes wrong - jupyter/docker-stacks scripts didn't grant sudo permissions
Bug description
Trying out sudo commands in the JupyterLab session fails with the following message:
sudo: effective uid is not 0, is /usr/bin/sudo on a filesystem with the 'nosuid' option set or an NFS file system without root privileges?
I am using the datascience-notebook from Docker Hub as a profile:
singleuser:
  image:
    name: jupyterhub/k8s-singleuser-sample
    tag: '1.1.3'
    pullPolicy: IfNotPresent
  profileList:
    - display_name: "Default"
      description: "Default"
      kubespawner_override:
        image: jupyter/datascience-notebook
I've followed other issues where the configuration changes helped, so I've reused them in my setup as follows:
singleuser:
  extraEnv:
    GRANT_SUDO: "yes"
    NOTEBOOK_ARGS: "--allow-root"
  uid: 0
  fsGid: 0
  cmd: start-singleuser.sh
This configuration takes effect during server boot up, as can be seen from the logs:
Set username to: jovyan
usermod: no changes
Granting jovyan sudo access and appending /opt/conda/bin to sudo PATH
Executing the command: jupyterhub-singleuser --allow-root --ip=0.0.0.0 --port=8888 --SingleUserNotebookApp.default_url=/lab
[W 2021-10-07 13:02:00.826 SingleUserNotebookApp configurable:193] Config option `open_browser` not recognized by `SingleUserNotebookApp`. Did you mean `browser`?
[I 2021-10-07 13:02:00.839 SingleUserNotebookApp notebookapp:1593] Authentication of /metrics is OFF, since other authentication is disabled.
[W 2021-10-07 13:02:01.453 LabApp] 'ip' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-10-07 13:02:01.454 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-10-07 13:02:01.454 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-10-07 13:02:01.454 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[I 2021-10-07 13:02:01.463 LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.9/site-packages/jupyterlab
...
Inside the session I also check for the presence of the sudoers configuration, and it is in place:
(base) jovyan@jupyter-lab:~$ ls /etc/sudoers.d
notebook path README
(base) jovyan@jupyter-lab:~$ cat /etc/sudoers.d/notebook
jovyan ALL=(ALL) NOPASSWD:ALL
(base) jovyan@jupyter-lab:~$ ls -la /etc/sudoers.d/notebook
-rw-r--r-- 1 root root 30 Oct 7 13:02 /etc/sudoers.d/notebook
(base) jovyan@jupyter-lab:~$ cat /etc/sudoers.d/path
Defaults secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin:/opt/conda/bin"
(base) jovyan@jupyter-lab:~$ id
uid=1000(jovyan) gid=100(users) groups=100(users)
Nevertheless, all sudo commands fail with the same error message:
(base) jovyan@jupyter-lab:~$ sudo -L
sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?
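For anyone debugging the same message, a couple of generic checks (standard Linux commands, nothing specific to this setup) help narrow down whether it is the nosuid/setuid side or the no_new_privs side of the error:

# The message can mean either that sudo lost its setuid bit / sits on a nosuid mount,
# or that the process runs with the kernel's no_new_privs flag set, which neuters setuid binaries.
grep NoNewPrivs /proc/self/status      # "NoNewPrivs: 1" means setuid binaries won't elevate
ls -l /usr/bin/sudo                    # should show the setuid bit, e.g. -rwsr-xr-x root root
findmnt -no OPTIONS -T /usr/bin/sudo   # check whether the filesystem is mounted with nosuid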
Expected behaviour
sudo working correctly with all commands
Actual behaviour
sudo does not work with the given configuration
Your personal set up
Running the Z2JH Helm chart v1.1.3 on bare-metal k8s 1.19.10
Is this resolved? I'm encountering the same issue: I can't install anything because I don't know the sudo password. And why is the default username jovyan?
This issue is a combination of:
- Making the Helm chart start the user server with root
- Setting environment variables
- Starting the user server with a specific docker container that contains a script, and that script doing what's needed.
Parts 1 and 2 are in scope for this Helm chart to fix, and I believe they work.
Please verify that kubectl get pod --output=yaml <pod name of user that should have sudo>
does:
- [ ] include a reference to start as uid 0, see the pod and containers security context
- [ ] makes the container start the chosen script
- [ ] sets the environment variables correctly on the container
- [ ] verify that the fsGid-equivalent k8s configuration is set in the pod's securityContext (this may not be relevant: not all storage supports this flag, NFS-based storage among them, so even if it is set it can be ignored)
If all of that checks out, the issue isn't with this Helm chart, but with the startup script, presumably from using a jupyter/docker-stacks based container. One way to pull those fields out of the pod is sketched below.
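For example, a generic way to inspect those items (replace jupyter-someuser with the actual pod name, and add -n <namespace> if your hub doesn't run in the current namespace):

# Inspect the checklist items above for a single user pod.
POD=jupyter-someuser   # replace with the pod name of the user that should have sudo
kubectl get pod "$POD" -o yaml | less                                              # full spec
kubectl get pod "$POD" -o jsonpath='{.spec.securityContext}{"\n"}'                 # pod-level, e.g. fsGroup
kubectl get pod "$POD" -o jsonpath='{.spec.containers[0].securityContext}{"\n"}'   # container-level, e.g. runAsUser
kubectl get pod "$POD" -o jsonpath='{.spec.containers[0].args}{"\n"}'              # chosen start script
kubectl get pod "$POD" -o jsonpath='{.spec.containers[0].env}{"\n"}'               # GRANT_SUDO and friends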
Most of the things on your list already check out:
This issue is a combination of:
1. Making the Helm chart start the user server with root
Jovyan user is granted sudo access as seen in logs during boot-up
2. Setting environment variables
These are part of the helm chart and I set GRANT_SUDO & NOTEBOOK_ARGS in the helm chart values
3. Starting the user server with a specific docker container that contains a script, and that script doing what's needed.
I am using the sample image that has that script and from the logs we can see it triggers properly
Parts 1 and 2 are in scope for this Helm chart to fix, and I believe they work.
Please verify that
kubectl get pod --output=yaml <pod name of user that should have sudo>
does:
* [ ] include a reference to start as uid 0, see the pod and containers security context
Pod security context:
securityContext:
  fsGroup: 0
Container security context:
securityContext:
  privileged: true
  runAsUser: 0
* [ ] makes the container start the chosen script
seen in the log outputs
* [ ] sets the environment variables correctly on the container
I guess these 3 are relevant, and they are set on the container:
- name: GRANT_SUDO
  value: "yes"
- name: JUPYTERHUB_ADMIN_ACCESS
  value: "1"
- name: NOTEBOOK_ARGS
  value: --allow-root
* [ ] verify that the fsGid-equivalent k8s configuration is set in the pod's securityContext (this may not be relevant: not all storage supports this flag, NFS-based storage among them, so even if it is set it can be ignored)
I can see fsGroup: 0 in the pod security context. Is this enough, or should there be something else?
If all of that checks out, the issue isn't with this Helm chart, but with the startup script, presumably from using a jupyter/docker-stacks based container.
Interesting, we've been seeing this too for about a month, after upgrading the z2jh and/or docker-stacks versions (not sure which did it). I'm working on debugging this now.
allowPrivilegeEscalation: false did catch my eye:
securityContext:
  allowPrivilegeEscalation: false
  runAsUser: 0
apiVersion: v1
kind: Pod
metadata:
  annotations:
    hub.jupyter.org/username: seth
  creationTimestamp: "2022-04-12T03:42:42Z"
  labels:
    app: jupyterhub
    chart: jupyterhub-1.1.3-n354.h751bc313
    component: singleuser-server
    heritage: jupyterhub
    hub.jupyter.org/network-access-hub: "true"
    hub.jupyter.org/servername: ""
    hub.jupyter.org/username: seth
    release: improc
  name: jupyter-seth
  namespace: improc
  resourceVersion: "18539122"
  uid: 8f6e8406-0f5e-400f-b8c6-7f76a8ad8980
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: hub.jupyter.org/node-purpose
            operator: In
            values:
            - user
        weight: 100
  automountServiceAccountToken: false
  containers:
  - args:
    - start-singleuser.sh
    env:
    - name: CPU_GUARANTEE
      value: "1.9"
    - name: CPU_LIMIT
      value: "8.0"
    - name: GRANT_SUDO
      value: "yes"
    - name: JUPYERHUB_SINGLEUSER_APP
      value: jupyter_server.serverapp.ServerApp
    - name: JUPYTERHUB_ACTIVITY_URL
      value: http://hub:8081/hub/api/users/seth/activity
    - name: JUPYTERHUB_ADMIN_ACCESS
      value: "1"
    - name: JUPYTERHUB_API_URL
      value: http://hub:8081/hub/api
    - name: JUPYTERHUB_BASE_URL
      value: /
    - name: JUPYTERHUB_CLIENT_ID
      value: jupyterhub-user-seth
    - name: JUPYTERHUB_DEFAULT_URL
      value: /lab
    - name: JUPYTERHUB_HOST
    - name: JUPYTERHUB_OAUTH_CALLBACK_URL
      value: /user/seth/oauth_callback
    - name: JUPYTERHUB_OAUTH_SCOPES
      value: '["access:servers!server=seth/", "access:servers!user=seth"]'
    - name: JUPYTERHUB_SERVER_NAME
    - name: JUPYTERHUB_SERVICE_PREFIX
      value: /user/seth/
    - name: JUPYTERHUB_SERVICE_URL
      value: http://0.0.0.0:8888/user/seth/
    - name: JUPYTERHUB_USER
      value: seth
    - name: JUPYTER_IMAGE
      value: gcr.io/ceres-imaging-science/improc-notebook:main
    - name: JUPYTER_IMAGE_SPEC
      value: gcr.io/ceres-imaging-science/improc-notebook:main
    - name: MEM_GUARANTEE
      value: "8589934592"
    - name: MEM_LIMIT
      value: "17179869184"
    - name: NB_USER
      value: seth
    - name: NOTEBOOK_ARGS
      value: --allow-root
    image: gcr.io/ceres-imaging-science/improc-notebook:main
    imagePullPolicy: Always
    lifecycle: {}
    name: notebook
    ports:
    - containerPort: 8888
      name: notebook-port
      protocol: TCP
    resources:
      limits:
        cpu: "8"
        memory: "17179869184"
      requests:
        cpu: 1900m
        memory: "8589934592"
    securityContext:
      allowPrivilegeEscalation: false
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /home
      name: home
    - mountPath: /flights
      name: ceres-flights
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  initContainers:
  - command:
    - iptables
    - -A
    - OUTPUT
    - -d
    - 169.254.169.254
    - -j
    - DROP
    image: jupyterhub/k8s-network-tools:1.1.3-n176.h739f4b47
    imagePullPolicy: IfNotPresent
    name: block-cloud-metadata
    resources: {}
    securityContext:
      capabilities:
        add:
        - NET_ADMIN
      privileged: true
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
  nodeName: gke-improc-cluster-general-use-c7937317-5j6v
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  priorityClassName: improc-default-priority
  restartPolicy: OnFailure
  schedulerName: improc-user-scheduler
  securityContext:
    fsGroup: 100
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoSchedule
    key: hub.jupyter.org/dedicated
    operator: Equal
    value: user
  - effect: NoSchedule
    key: hub.jupyter.org_dedicated
    operator: Equal
    value: user
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: home
    persistentVolumeClaim:
      claimName: icin-homedirs
  - name: ceres-flights
    persistentVolumeClaim:
      claimName: ceres-flights
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-04-12T03:42:52Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-04-12T03:42:53Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-04-12T03:42:53Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2022-04-12T03:42:42Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://c9c6acb4cc30f58f146da24021ef0d176ded2ee105ea6477f64279128385f655
    image: gcr.io/ceres-imaging-science/improc-notebook:main
    imageID: gcr.io/ceres-imaging-science/improc-notebook@sha256:6150703d1308a963d367cd6316dfe48361bcc8e2f01134a192f6b767c47e9054
    lastState: {}
    name: notebook
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2022-04-12T03:42:53Z"
  hostIP: 10.138.0.27
  initContainerStatuses:
  - containerID: containerd://15188753e872bf5d26cdce5efb4f39697cea74a03dd2e6526537522aeb7cbc1d
    image: docker.io/jupyterhub/k8s-network-tools:1.1.3-n176.h739f4b47
    imageID: docker.io/jupyterhub/k8s-network-tools@sha256:85f0e20cc9231808ce425916d0a1f5428e09d14a579c81b549e82538e87c1e4c
    lastState: {}
    name: block-cloud-metadata
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: containerd://15188753e872bf5d26cdce5efb4f39697cea74a03dd2e6526537522aeb7cbc1d
        exitCode: 0
        finishedAt: "2022-04-12T03:42:51Z"
        reason: Completed
        startedAt: "2022-04-12T03:42:51Z"
  phase: Running
  podIP: 10.20.7.14
  podIPs:
  - ip: 10.20.7.14
  qosClass: Burstable
  startTime: "2022-04-12T03:42:42Z"
@consideRatio what would you think of a PR to z2jh that adds a singleuser.sudo = true|false value (default: false), which in turn makes the chart add the various bits (GRANT_SUDO env var, uid: 0, fsGid: 0, etc.) needed to make this work?
My thinking behind making sudo-on-singleuser a 'single flag feature' would be:
- This is probably a pretty common feature, but it's very hard to configure at the moment without fairly deep JupyterHub + kubespawner + k8s knowledge. Getting sudo access in the pods currently takes a pretty k8s-knowledgeable operator: there are enough values you have to get right that it's easy to mess up even when following a step-by-step guide, and if/when you do mess up, the layers you have to debug through are fairly deep. I suspect a fair percentage of scientist-user clusters would prefer to enable sudo? Templating is more accurately executed by computers than sysadmins haha.
- A single value may be easier to support over long-term upgrades than a shifting variety of flags that accomplish the sudo-on-singleuser goal: in our case, having this be several bits and bobbles you add to your own z2jh config has meant our sudo has broken a couple of times over the years as the "necessary bits and bobbles" list has shifted. So from a chart maintenance and long-term compatibility standpoint, I think a single value that activates the lower-level flags/env/etc might be easier to support! Which in turn might reduce upstream noise and support demand for sudo assistance (which probably hasn't been that high in reality or this would already be a feature grin, but still). (A rough sketch of what such a flag could expand to is below.)
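To make that concrete, here is a purely hypothetical expansion; singleuser.sudo is not an existing chart value, and the settings it would bundle are just the ones already used in this thread:

# Hypothetical: what a single "singleuser.sudo: true" flag might expand to under the hood.
singleuser:
  uid: 0
  fsGid: 0
  cmd: start-singleuser.sh
  extraEnv:
    GRANT_SUDO: "yes"
    NOTEBOOK_ARGS: "--allow-root"
  # ...plus whatever spawner-level settings turn out to be required (see the rest of this thread).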
OK! I think my initial intuition is bearing out: you probably need the singleuser image to be launched (by kubespawner) with allowPrivilegeEscalation: true in order to use sudo on z2jh-spawned notebooks.
Relevant kubespawner docs are https://jupyterhub-kubespawner.readthedocs.io/en/latest/spawner.html#kubespawner.KubeSpawner.allow_privilege_escalation :
Which explicitly states: "When set to False (the default), the primary user visible effect is that setuid binaries (like sudo) will no longer work."
For those (like me) coming from an older kubespawner, which did not yet protect (wisely, by default!) against malicious users escaping their k8s jail, this is a change we'll need to make to get our JupyterHubs working with sudo again.
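Concretely, kubespawner renders this trait into the notebook container's securityContext, so in a pod dump like the one above the difference looks roughly like this:

# with allow_privilege_escalation left at its default of False (sudo broken):
securityContext:
  allowPrivilegeEscalation: false
  runAsUser: 0

# with c.KubeSpawner.allow_privilege_escalation = True (setuid sudo can work again):
securityContext:
  allowPrivilegeEscalation: true
  runAsUser: 0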
@Atharex any chance you're still working on this and can double verify my fix?
Currently I don't think there's a way to set the kubespawner allow_privilege_escalation config trait from the z2jh chart without overriding the config, so I've created a PR to permit this: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/2663
For anyone else bumping into this who's gone through @consideRatio 's checklist and still is having problems, try adding this to your values.yaml:
hub:
  extraConfig:
    allowPrivilegeEscalationForSudo: |
      # unbreak sudo, see: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/2429
      c.KubeSpawner.allow_privilege_escalation = True
For us, this was the key to re-enabling sudo on a current z2jh chart (which has a current kubespawner, which I suspect is where this breakage originated).
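One way to confirm the setting took effect after restarting a user server is to read the notebook container's securityContext back out of the pod (generic kubectl, substitute your own pod name):

# Should include "allowPrivilegeEscalation":true once the fix is active.
kubectl get pod jupyter-someuser -o jsonpath='{.spec.containers[?(@.name=="notebook")].securityContext}{"\n"}'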
For a full "fixed" example of @Atharex 's original question, I believe the following (adding KubeSpaner.allow_privilege_escalation = True
) will work:
singleuser:
  extraEnv:
    GRANT_SUDO: "yes"
    NOTEBOOK_ARGS: "--allow-root"
  uid: 0
  fsGid: 0
  cmd: start-singleuser.sh
hub:
  extraConfig:
    allowPrivilegeEscalationForSudo: |
      # unbreak sudo, see: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/2429
      c.KubeSpawner.allow_privilege_escalation = True
I haven't read all this and am a bit too busy to focus on this right now, but please note that the kubespawner 3 release included changes relevant to this. Check out its changelog! I hope to get time to focus on this soon, but not at this moment =/
@consideRatio exactly haha: kubespawner 3 introduced the change that broke our sudo, and I'm guessing it did for other folks in this thread too. Here's the commit to kubespawner by @yuvipanda: https://github.com/jupyterhub/kubespawner/commit/4258795617f539229ecda2a12ce1600891b5bdf9 which explicitly calls out what needs to be changed to use sudo with kubespawner 3 (same as I discovered above: you now need c.KubeSpawner.allow_privilege_escalation = True to use sudo).
I agree this is a much better default for most installs, where users can't be 100% trusted, but we may need to make "hey, you need to update this if you want to keep using sudo with kubespawner 3" more discoverable for z2jh users (I actually hadn't found this commit until now and figured this out by digging through code), since I suspect many folks aren't even aware of kubespawner details, let alone knowing when it's upgraded automatically by their chart and reading changelogs grin
If/when you have time to process this issue, I think z2jh could use a small update to support the new required-for-sudo allow_privilege_escalation param directly, and any "here's how to enable sudo" docs (if any) will also need updating. I'd be happy to do the legwork on a PR, but I know that even review bandwidth can be very limited, so no arm twisting intended!!! I appreciate all the open source work you find the time for, whenever you find the time 🙏🏽🙏🏽🙏🏽
I think fixing sudo for everyone using z2jh + kubespawner 3 is very easy now that we know what the issue is. From @yuvipanda's commit to kubespawner 3:
Default allow_privilege_escalation to False
Allows it to be set to None as well, to not set the property.
This is a breaking change for hubs where admins were granting
sudo rights to users. That already required some extra work,
so this would be an additional propety to set for that. The
added security benefit from this much more secure default is
well worth the breakage IMO.
Tried 100 things, this worked for me:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: jupyter-hub
  namespace: system
spec:
  chart:
    spec:
      chart: jupyterhub
      sourceRef:
        kind: HelmRepository
        name: jupyter-hub
        namespace: system
      version: '1.1.3-n350.h849ece98'
  interval: 1m0s
  values:
    hub:
      db:
        pvc:
          storageClassName: local
      networkPolicy:
        enabled: false
      # @note [https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/2429]
      extraConfig:
        allowPrivilegeEscalationForSudo: |
          # unbreak sudo, see: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/2429
          c.KubeSpawner.allow_privilege_escalation = True
    proxy:
      service:
        type: ClusterIP
      chp:
        networkPolicy:
          enabled: false
    singleuser:
      image:
        name: jupyter/all-spark-notebook
        tag: spark-3.2.1
      # @note [https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/562]
      extraEnv:
        GRANT_SUDO: "yes"
        NOTEBOOK_ARGS: "--allow-root"
      uid: 0
      fsGid: 0
      storage:
        extraVolumes: []
        extraVolumeMounts: []
        dynamic:
          storageClass: local
I am facing similar issues in a single-node microk8s. I have no problems running JupyterHub, but after logging in the spawner is stuck in a pending state. If I get the logs of the pod with
microk8s kubectl logs jupyter-username
here's the response:
Defaulted container "notebook" out of: notebook, block-cloud-metadata (init)
Can anyone please help?
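For anyone stuck at that Pending state, the scheduler usually records the reason in the pod's events; a generic way to look (the jhub namespace here is just an example, use your own):

# A Pending pod normally has an event explaining why (unbound PVC, insufficient CPU/memory, taints, ...).
microk8s kubectl describe pod jupyter-username -n jhub | tail -n 20
microk8s kubectl get events -n jhub --sort-by=.lastTimestamp | tail -n 20
# Also check that the user's storage claim, if any, is Bound:
microk8s kubectl get pvc -n jhub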
@bilbomaticaeugis did you solve this? I'm running into the same problem and can't find a place to look to see the root cause.
@bilbomaticaeugis @portega-inbrain I am unfortunately facing the same issue: the jupyter-admin pod doesn't come up when I initially log in to spawn the server. Can anyone please help?
@amrap030 @bilbomaticaeugis @portega-inbrain I have the same problem, is there any news on this?
RUN echo "jovyan ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers && \
usermod -aG sudo jovyan && \
usermod -aG root jovyan
I'm doing this in mine additional to "allowPrivilegeEscalation" to enable sudo in stacks
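For context, a minimal Dockerfile sketch of where lines like that can live in a docker-stacks-derived image; the base image tag and the sudoers.d file name are illustrative, not taken from this thread:

# Illustrative sketch only - adjust the base image and file name to your setup.
FROM jupyter/datascience-notebook:latest

USER root
# Grant jovyan passwordless sudo; a drop-in under /etc/sudoers.d keeps /etc/sudoers untouched.
RUN echo "jovyan ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/jovyan && \
    chmod 0440 /etc/sudoers.d/jovyan

# Switch back to the unprivileged notebook user for the final image.
USER ${NB_UID}

Even with this baked into the image, the pod still has to run with allowPrivilegeEscalation enabled (see above) for the setuid sudo binary to actually work.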