
Something along the way goes wrong - jupyter/docker-stacks scripts didn't grant sudo permissions

Atharex opened this issue 3 years ago • 17 comments

Bug description

Trying out sudo commands in the JupyterLab session fails with the following message: sudo: effective uid is not 0, is /usr/bin/sudo on a filesystem with the 'nosuid' option set or an NFS file system without root privileges?

I am using the datascience-notebook from Docker Hub as a profile:

singleuser:
  image:
    name: jupyterhub/k8s-singleuser-sample
    tag: '1.1.3'
    pullPolicy: IfNotPresent
  profileList:

    - display_name: "Default"
      description: "Default"
      kubespawner_override:
        image: jupyter/datascience-notebook

I've followed other issues where the configuration changes helped, so I've reused them in my setup as follows:

singleuser:
  extraEnv:
    GRANT_SUDO: "yes"
    NOTEBOOK_ARGS: "--allow-root"
  uid: 0
  fsGid: 0
  cmd: start-singleuser.sh

This configuration takes effect during server boot up, as can be seen from the logs:

Set username to: jovyan
usermod: no changes
Granting jovyan sudo access and appending /opt/conda/bin to sudo PATH
Executing the command: jupyterhub-singleuser --allow-root --ip=0.0.0.0 --port=8888 --SingleUserNotebookApp.default_url=/lab
[W 2021-10-07 13:02:00.826 SingleUserNotebookApp configurable:193] Config option `open_browser` not recognized by `SingleUserNotebookApp`.  Did you mean `browser`?
[I 2021-10-07 13:02:00.839 SingleUserNotebookApp notebookapp:1593] Authentication of /metrics is OFF, since other authentication is disabled.
[W 2021-10-07 13:02:01.453 LabApp] 'ip' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-10-07 13:02:01.454 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-10-07 13:02:01.454 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-10-07 13:02:01.454 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[I 2021-10-07 13:02:01.463 LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.9/site-packages/jupyterlab
...

Inside the session I also check for the presence of sudoers configuration and it is in place:

(base) jovyan@jupyter-lab:~$ ls /etc/sudoers.d
notebook  path  README
(base) jovyan@jupyter-lab:~$ cat /etc/sudoers.d/notebook
jovyan ALL=(ALL) NOPASSWD:ALL
(base) jovyan@jupyter-lab:~$ ls -la /etc/sudoers.d/notebook 
-rw-r--r-- 1 root root 30 Oct  7 13:02 /etc/sudoers.d/notebook
(base) jovyan@jupyter-lab:~$ cat /etc/sudoers.d/path
Defaults secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin:/opt/conda/bin"
(base) jovyan@jupyter-lab:~$ id
uid=1000(jovyan) gid=100(users) groups=100(users)

Nevertheless, all sudo commands fail with the same error message:

(base) jovyan@jupyter-lab:~$ sudo -L
sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?
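This error means sudo started but is not running with effective uid 0. That happens when the setuid bit on /usr/bin/sudo is missing, or when the kernel is told to ignore it (a nosuid mount, or a container started with allowPrivilegeEscalation: false, as discussed later in this thread). A minimal sketch of the two filesystem-level conditions sudo's message hints at (the function names are mine, for illustration only):

```python
import stat

SETUID_BIT = stat.S_ISUID  # 0o4000

def setuid_root(mode: int, owner_uid: int) -> bool:
    """True if a binary with this mode and owner would run with euid 0."""
    return owner_uid == 0 and bool(mode & SETUID_BIT)

def mounted_nosuid(proc_mounts: str, mountpoint: str) -> bool:
    """Scan /proc/mounts-style text for a nosuid flag on the given mountpoint."""
    for line in proc_mounts.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[1] == mountpoint:
            return "nosuid" in parts[3].split(",")
    return False

# /usr/bin/sudo is normally mode 4755, owned by root
assert setuid_root(0o4755, 0)        # setuid root: sudo can elevate
assert not setuid_root(0o755, 0)     # setuid bit stripped: the error above
# a nosuid mount makes the kernel ignore the bit even when it is set
assert mounted_nosuid("overlay / overlay rw,nosuid 0 0", "/")
```

In a running session, `ls -l /usr/bin/sudo` (look for the `s` in `-rwsr-xr-x`) and `grep ' / ' /proc/mounts` cover the same two checks.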

Expected behaviour

sudo working correctly with all commands

Actual behaviour

sudo does not work with the given configuration

Your personal set up

Running ZTJH helm chart v1.1.3 on baremetal k8s 1.19.10

Atharex avatar Oct 07 '21 13:10 Atharex

Is this resolved? I'm encountering the same issue: I can't install anything because I don't know the sudo password. And why is the default username jovyan?

obgeneralao avatar Oct 30 '21 12:10 obgeneralao

This issue is a combination of:

  1. Making the Helm chart start the user server with root
  2. Setting environment variables
  3. Starting the user server with a specific docker container that contains a script, and that script doing what's needed.

Parts 1 and 2 are in scope for this Helm chart, and I believe they work.

Please verify that kubectl get pod --output=yaml <pod name of user that should have sudo> does:

  • [ ] include a reference to start as uid 0, see the pod and containers security context
  • [ ] makes the container start the chosen script
  • [ ] sets the environment variables correctly on the container
  • [ ] verify if the fsGid-equivalent k8s configuration in the pod's securityContext is set (this may not be relevant; not all storage supports this flag, NFS-based storage among them, and even if it is set it can be ignored)

If all of that checks out, the issue isn't with this Helm chart, but with the script, presumably from using a jupyter/docker-stacks based container.
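The checklist above can also be run programmatically against `kubectl get pod <pod name> --output=json`. A hedged sketch, using the Kubernetes Pod API field names; the expected values (uid 0, GRANT_SUDO, start-singleuser.sh) are taken from this thread, and the function name is mine:

```python
def check_sudo_prereqs(pod: dict) -> dict:
    """Report which sudo prerequisites a pod (as a kubectl JSON dict) satisfies."""
    spec = pod.get("spec", {})
    container = (spec.get("containers") or [{}])[0]
    ctx = container.get("securityContext") or {}
    env = {e.get("name"): e.get("value") for e in container.get("env", [])}
    return {
        "runs_as_root": ctx.get("runAsUser") == 0,
        # a literal False here blocks setuid binaries like sudo
        "privilege_escalation_allowed": ctx.get("allowPrivilegeEscalation") is not False,
        "grant_sudo_env": env.get("GRANT_SUDO") == "yes",
        "starts_chosen_script": "start-singleuser.sh" in (container.get("args") or []),
        "fs_group_set": "fsGroup" in (spec.get("securityContext") or {}),
    }
```

Feed it `json.loads(subprocess.check_output(["kubectl", "get", "pod", name, "-o", "json"]))`; any False entry points at the layer to investigate.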

consideRatio avatar Oct 30 '21 14:10 consideRatio

Most of the things on your list already check out:

This issue is a combination of:

1. Making the Helm chart start the user server with root

The jovyan user is granted sudo access, as seen in the boot-up logs

2. Setting environment variables

These are part of the helm chart and I set GRANT_SUDO & NOTEBOOK_ARGS in the helm chart values

3. Starting the user server with a specific docker container that contains a script, and that script doing what's needed.

I am using the sample image that has that script and from the logs we can see it triggers properly

Parts 1 and 2 are in scope for this Helm chart, and I believe they work.

Please verify that kubectl get pod --output=yaml <pod name of user that should have sudo> does:

* [ ]  include a reference to start as uid 0, see the pod and containers security context

Pod security context: securityContext: fsGroup: 0

Container security context: securityContext: privileged: true runAsUser: 0

* [ ]  makes the container start the chosen script

seen in the log outputs

* [ ]  sets the environment variables correctly on the container

I guess these three are relevant, and they are set on the container:

    - name: GRANT_SUDO
      value: "yes"
    - name: JUPYTERHUB_ADMIN_ACCESS
      value: "1"
    - name: NOTEBOOK_ARGS
      value: --allow-root
* [ ]  verify if the fsGid-equivalent k8s configuration in the pod's securityContext is set (this may not be relevant; not all storage supports this flag, NFS-based storage among them, and even if it is set it can be ignored)

I can see the fsGroup:0 in the pod security context. Is this enough, or should there be something else?

If all of that checks out, the issue isn't with this Helm chart, but with the script, presumably from using a jupyter/docker-stacks based container.

Atharex avatar Nov 16 '21 07:11 Atharex

Interesting, we've been seeing this too for about a month, after upgrading z2jh and/or docker-stacks versions (not sure which did it). I'm working to debug this now.

allowPrivilegeEscalation: false did catch my eye:

    securityContext:
      allowPrivilegeEscalation: false
      runAsUser: 0
The full pod YAML:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    hub.jupyter.org/username: seth
  creationTimestamp: "2022-04-12T03:42:42Z"
  labels:
    app: jupyterhub
    chart: jupyterhub-1.1.3-n354.h751bc313
    component: singleuser-server
    heritage: jupyterhub
    hub.jupyter.org/network-access-hub: "true"
    hub.jupyter.org/servername: ""
    hub.jupyter.org/username: seth
    release: improc
  name: jupyter-seth
  namespace: improc
  resourceVersion: "18539122"
  uid: 8f6e8406-0f5e-400f-b8c6-7f76a8ad8980
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: hub.jupyter.org/node-purpose
            operator: In
            values:
            - user
        weight: 100
  automountServiceAccountToken: false
  containers:
  - args:
    - start-singleuser.sh
    env:
    - name: CPU_GUARANTEE
      value: "1.9"
    - name: CPU_LIMIT
      value: "8.0"
    - name: GRANT_SUDO
      value: "yes"
    - name: JUPYERHUB_SINGLEUSER_APP
      value: jupyter_server.serverapp.ServerApp
    - name: JUPYTERHUB_ACTIVITY_URL
      value: http://hub:8081/hub/api/users/seth/activity
    - name: JUPYTERHUB_ADMIN_ACCESS
      value: "1"
    - name: JUPYTERHUB_API_URL
      value: http://hub:8081/hub/api
    - name: JUPYTERHUB_BASE_URL
      value: /
    - name: JUPYTERHUB_CLIENT_ID
      value: jupyterhub-user-seth
    - name: JUPYTERHUB_DEFAULT_URL
      value: /lab
    - name: JUPYTERHUB_HOST
    - name: JUPYTERHUB_OAUTH_CALLBACK_URL
      value: /user/seth/oauth_callback
    - name: JUPYTERHUB_OAUTH_SCOPES
      value: '["access:servers!server=seth/", "access:servers!user=seth"]'
    - name: JUPYTERHUB_SERVER_NAME
    - name: JUPYTERHUB_SERVICE_PREFIX
      value: /user/seth/
    - name: JUPYTERHUB_SERVICE_URL
      value: http://0.0.0.0:8888/user/seth/
    - name: JUPYTERHUB_USER
      value: seth
    - name: JUPYTER_IMAGE
      value: gcr.io/ceres-imaging-science/improc-notebook:main
    - name: JUPYTER_IMAGE_SPEC
      value: gcr.io/ceres-imaging-science/improc-notebook:main
    - name: MEM_GUARANTEE
      value: "8589934592"
    - name: MEM_LIMIT
      value: "17179869184"
    - name: NB_USER
      value: seth
    - name: NOTEBOOK_ARGS
      value: --allow-root
    image: gcr.io/ceres-imaging-science/improc-notebook:main
    imagePullPolicy: Always
    lifecycle: {}
    name: notebook
    ports:
    - containerPort: 8888
      name: notebook-port
      protocol: TCP
    resources:
      limits:
        cpu: "8"
        memory: "17179869184"
      requests:
        cpu: 1900m
        memory: "8589934592"
    securityContext:
      allowPrivilegeEscalation: false
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /home
      name: home
    - mountPath: /flights
      name: ceres-flights
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  initContainers:
  - command:
    - iptables
    - -A
    - OUTPUT
    - -d
    - 169.254.169.254
    - -j
    - DROP
    image: jupyterhub/k8s-network-tools:1.1.3-n176.h739f4b47
    imagePullPolicy: IfNotPresent
    name: block-cloud-metadata
    resources: {}
    securityContext:
      capabilities:
        add:
        - NET_ADMIN
      privileged: true
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
  nodeName: gke-improc-cluster-general-use-c7937317-5j6v
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  priorityClassName: improc-default-priority
  restartPolicy: OnFailure
  schedulerName: improc-user-scheduler
  securityContext:
    fsGroup: 100
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoSchedule
    key: hub.jupyter.org/dedicated
    operator: Equal
    value: user
  - effect: NoSchedule
    key: hub.jupyter.org_dedicated
    operator: Equal
    value: user
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: home
    persistentVolumeClaim:
      claimName: icin-homedirs
  - name: ceres-flights
    persistentVolumeClaim:
      claimName: ceres-flights
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-04-12T03:42:52Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-04-12T03:42:53Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-04-12T03:42:53Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2022-04-12T03:42:42Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://c9c6acb4cc30f58f146da24021ef0d176ded2ee105ea6477f64279128385f655
    image: gcr.io/ceres-imaging-science/improc-notebook:main
    imageID: gcr.io/ceres-imaging-science/improc-notebook@sha256:6150703d1308a963d367cd6316dfe48361bcc8e2f01134a192f6b767c47e9054
    lastState: {}
    name: notebook
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2022-04-12T03:42:53Z"
  hostIP: 10.138.0.27
  initContainerStatuses:
  - containerID: containerd://15188753e872bf5d26cdce5efb4f39697cea74a03dd2e6526537522aeb7cbc1d
    image: docker.io/jupyterhub/k8s-network-tools:1.1.3-n176.h739f4b47
    imageID: docker.io/jupyterhub/k8s-network-tools@sha256:85f0e20cc9231808ce425916d0a1f5428e09d14a579c81b549e82538e87c1e4c
    lastState: {}
    name: block-cloud-metadata
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: containerd://15188753e872bf5d26cdce5efb4f39697cea74a03dd2e6526537522aeb7cbc1d
        exitCode: 0
        finishedAt: "2022-04-12T03:42:51Z"
        reason: Completed
        startedAt: "2022-04-12T03:42:51Z"
  phase: Running
  podIP: 10.20.7.14
  podIPs:
  - ip: 10.20.7.14
  qosClass: Burstable
  startTime: "2022-04-12T03:42:42Z"

snickell avatar Apr 12 '22 04:04 snickell

@consideRatio what would you think of a PR to z2jh that adds a singleUser.sudo = true|false (default: false) value, which in turn makes the chart add the various bits (GRANT_SUDO env var, uid: 0, fsGid: 0, etc.) needed to make this work?

My thinking behind making sudo-on-singleuser a 'single flag feature' would be:

  1. This is probably a pretty common feature, but it's very hard to configure at the moment without fairly deep JupyterHub + kubespawner + k8s knowledge. There are enough values you have to get right that it's easy to mess up while following a step-by-step guide, and when you do, the layers you have to debug through are fairly deep. I suspect a fair percentage of scientist-user clusters would prefer to enable sudo. Templating is more accurately executed by computers than by sysadmins haha.
  2. A single value may be easier to support across long-term upgrades than a shifting set of flags that accomplish the sudo-on-singleuser goal. In our case, having this be several bits and bobs you add to your own z2jh config has meant our sudo has broken a couple of times over the years as the list of necessary settings has shifted. So from a chart-maintenance and long-term-compatibility standpoint, a single value that activates the lower-level flags/env/etc. might be easier to support, which in turn might reduce upstream noise and support demand for sudo help (which probably hasn't been that high in reality or this would already be a feature :grin:, but still).

snickell avatar Apr 12 '22 04:04 snickell

OK! I think my initial intuition is bearing out: you probably need the singleuser image to be launched (by kubespawner) with allowPrivilegeEscalation: true in order to use sudo on z2jh-spawned notebooks.

The relevant kubespawner docs are https://jupyterhub-kubespawner.readthedocs.io/en/latest/spawner.html#kubespawner.KubeSpawner.allow_privilege_escalation

Which explicitly states: "When set to False (the default), the primary user visible effect is that setuid binaries (like sudo) will no longer work."

For those (like me) coming from an older kubespawner, which did not (wisely, by default!) protect against malicious users escaping their k8s jail, this is a change we'll need to make to get our JupyterHubs working with sudo again.
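To make the behavior change concrete: the `allow_privilege_escalation` trait ends up as the `allowPrivilegeEscalation` field of the container's securityContext. A simplified sketch of that mapping (not kubespawner's actual code, just the semantics described in its docs):

```python
def security_context(run_as_user, allow_privilege_escalation):
    """Sketch: how spawner traits map onto a container securityContext dict."""
    ctx = {}
    if run_as_user is not None:
        ctx["runAsUser"] = run_as_user
    if allow_privilege_escalation is not None:  # None means: omit the field entirely
        ctx["allowPrivilegeEscalation"] = allow_privilege_escalation
    return ctx

# pre-kubespawner-3 behavior: field omitted, so setuid binaries still work
assert security_context(0, None) == {"runAsUser": 0}
# kubespawner 3 default: an explicit False, which disables setuid binaries like sudo
assert security_context(0, False) == {"runAsUser": 0, "allowPrivilegeEscalation": False}
```

That explicit `allowPrivilegeEscalation: false` in the pod dump above is exactly what makes the kernel ignore sudo's setuid bit.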

snickell avatar Apr 12 '22 05:04 snickell

@Atharex any chance you're still working on this and can double verify my fix?

snickell avatar Apr 12 '22 05:04 snickell

Currently I don't think there's a way to set the kubespawner allow_privilege_escalation config trait from the z2jh chart without overriding the config, so I've created a PR to permit this: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/2663

snickell avatar Apr 12 '22 05:04 snickell

For anyone else bumping into this who's gone through @consideRatio 's checklist and still is having problems, try adding this to your values.yaml:

singleuser:
  extraConfig:
    allowPrivilegeEscalationForSudo: |
      # unbreak sudo, see: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/2429
      c.KubeSpawner.allow_privilege_escalation = True 

For us, this was the key to re-enabling sudo on a current z2jh chart (which ships a current kubespawner, which I suspect is where this breakage originated)

For a full "fixed" example of @Atharex 's original question, I believe the following (adding KubeSpawner.allow_privilege_escalation = True) will work:

singleuser:
  extraEnv:
    GRANT_SUDO: "yes"
    NOTEBOOK_ARGS: "--allow-root"
  uid: 0
  fsGid: 0
  cmd: start-singleuser.sh
  extraConfig:
    allowPrivilegeEscalationForSudo: |
      # unbreak sudo, see: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/2429
      c.KubeSpawner.allow_privilege_escalation = True 

snickell avatar Apr 12 '22 06:04 snickell

I haven't read all of this and am a bit too busy to focus on it right now, but please note that the kubespawner 3 release included changes relevant to this. Check out its changelog! I hope to get time to focus on this soon, but not at this moment =/

consideRatio avatar Apr 12 '22 15:04 consideRatio

@consideRatio exactly haha: kubespawner 3 introduced the change that broke our sudo, and I'm guessing it broke others in this thread too. Here's the commit to kubespawner by @yuvipanda : https://github.com/jupyterhub/kubespawner/commit/4258795617f539229ecda2a12ce1600891b5bdf9 which explicitly calls out what needs to change to use sudo with kubespawner 3 (same as I discovered above: you now need c.KubeSpawner.allow_privilege_escalation = True to use sudo).

I agree this is a much better default for most installs that can't 100% trust their users, but we may need to make "hey, you need to update this if you want to keep using sudo with kubespawner 3" more discoverable for z2jh users (I actually hadn't found this commit until now and figured this out by digging through code), since I suspect many folks aren't even aware of kubespawner details, let alone know when it's upgraded automatically by their chart or read its changelogs :grin:

If/when you have time to process this issue, I think z2jh could use a small update to support the now-required-for-sudo allow_privilege_escalation parameter directly, and any "here's how to enable sudo" docs (if any) will also need updating. I'd be happy to do the legwork on a PR, but I know that even review bandwidth can be very limited, so no arm-twisting intended!!! I appreciate all the open source work you find the time for, whenever you find the time 🙏🏽🙏🏽🙏🏽

I think fixing sudo for everyone using z2jh + kubespawner3 is very easy now that we know what the issue is, from @yuvipanda 's commit to kubespawner 3:

Default allow_privilege_escalation to False
Allows it to be set to None as well, to not set the property.

This is a breaking change for hubs where admins were granting
sudo rights to users. That already required some extra work,
so this would be an additional property to set for that. The
added security benefit from this much more secure default is
well worth the breakage IMO.

snickell avatar Apr 15 '22 15:04 snickell


Tried 100 things, this worked for me:

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: jupyter-hub
  namespace: system
spec:
  chart:
    spec:
      chart: jupyterhub
      sourceRef:
        kind: HelmRepository
        name: jupyter-hub
        namespace: system
      version: '1.1.3-n350.h849ece98'
  interval: 1m0s
  values:
    hub:
      db:
        pvc:
          storageClassName: local
      networkPolicy:
        enabled: false
      # @note [https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/2429]
      extraConfig:
        allowPrivilegeEscalationForSudo: |
          # unbreak sudo, see: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/2429
          c.KubeSpawner.allow_privilege_escalation = True
    proxy:
      service:
        type: ClusterIP
      chp:
        networkPolicy:
          enabled: false
    singleuser:
      image:
        name: jupyter/all-spark-notebook
        tag: spark-3.2.1
      # @note [https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/562]
      extraEnv:
        GRANT_SUDO: "yes"
        NOTEBOOK_ARGS: "--allow-root"
      uid: 0
      fsGid: 0
      storage:
        extraVolumes: []
        extraVolumeMounts: []
        dynamic:
          storageClass: local

zzvara avatar Jun 24 '22 17:06 zzvara

I am facing similar issues on a single-node microk8s. I have no problem running JupyterHub, but after logging in the spawner is stuck in a pending state. If I get the logs of the pod with microk8s kubectl logs jupyter-username, here's the response: Defaulted container "notebook" out of: notebook, block-cloud-metadata (init)

Can anyone please help?

oskaresparza avatar Oct 27 '22 07:10 oskaresparza

@bilbomaticaeugis did you solve this? I'm running into the same problem and can't find a place to look to see the root cause.

portega-inbrain avatar Dec 09 '22 15:12 portega-inbrain

@bilbomaticaeugis @portega-inbrain I am unfortunately facing the same issue; the jupyter-admin pod doesn't come up when I initially log in to spawn the server. Please can anyone help?

amrap030 avatar Jun 28 '23 20:06 amrap030

@amrap030 @bilbomaticaeugis @portega-inbrain I have the same problem, is there any news on this?

beorostica avatar Jul 19 '23 18:07 beorostica

RUN echo "jovyan ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers && \
    usermod -aG sudo jovyan && \
    usermod -aG root jovyan

I'm doing this in my image, in addition to "allowPrivilegeEscalation", to enable sudo in the docker-stacks images

dimm0 avatar Sep 02 '23 05:09 dimm0