zero-to-jupyterhub-k8s

Docs: overview of the chart components down to the Pod level

Open consideRatio opened this issue 5 years ago • 3 comments

@NerdSec raised a good point: an overview of the chart's components would be a good addition to the documentation, and I agree. I think it makes sense to write this overview while also describing details at the k8s level, such as which pods do what. Here is a quick overview to help anyone get started on adding this to the docs.

Boilerplate overview

  • hub pod - where JupyterHub runs. JupyterHub interfaces with a Python Spawner class (KubeSpawner) and a Python Proxy class (ConfigurableHTTPProxy). The KubeSpawner class speaks with the k8s api-server (REST API) to create user pods etc., and the ConfigurableHTTPProxy class speaks with the proxy pod (REST API), which runs the actual ConfigurableHTTPProxy NodeJS application.
  • proxy pod - where a jupyterhub/ConfigurableHTTPProxy (NodeJS) server runs. It is the target for all traffic incoming from the internet. By default it routes traffic to the hub k8s Service, and thereby indirectly to the hub pod, but it can be configured to send routes like /user/my-username to a user server pod if JupyterHub has configured such a network route through the server's REST API.
  • user-placeholder pod(s): Seat warmers for real users that ensure the k8s cluster autoscales and adds nodes ahead of time. They are evicted whenever a real user needs the room to fit on a node in the k8s cluster.
  • user-scheduler pod(s): Responsible for acting as the k8s scheduler for the user pods and user-placeholder pods. The default k8s pod scheduler would spread pods out across different nodes, but the user-scheduler by default packs them onto the busiest node instead. This is desirable because it helps the k8s cluster scale down: nodes can only be freed up properly if new users don't end up on the least busy node.
  • autohttps pod: a Traefik server that, when enabled, positions itself just in front of the proxy pod to acquire an HTTPS certificate from Let's Encrypt, and then decrypts/encrypts all traffic going to and from the internet.
  • hook-image-puller and hook-image-awaiter pods: These pods are created temporarily when helm install or helm upgrade is executed. Their purpose is to ensure the k8s nodes populate their caches with the docker images that will be referenced once the chart is upgraded. The hook-image-awaiter pod's job is to halt the rest of the upgrade until the hook-image-puller pods have downloaded the docker images.
  • continuous-image-puller pods: These pods complement the hook-image-puller pods by ensuring that nodes dynamically added to the cluster get the relevant images pulled to them.
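
To make the proxy pod's role more concrete, here is a minimal Python sketch of the routing idea described above: a default route that points at the hub, plus per-user routes that win when the request path falls under them. This is a toy model for the docs, not ConfigurableHTTPProxy's actual implementation. The class name RoutingTable, its methods, and the example addresses are all made up for illustration.

```python
from typing import Dict


class RoutingTable:
    """Toy longest-prefix router (hypothetical, for illustration only)."""

    def __init__(self, default_target: str) -> None:
        # "/" is the default route. In the chart it would point at the
        # hub k8s Service; the address used below is only an example.
        self._routes: Dict[str, str] = {"/": default_target}

    @staticmethod
    def _normalize(prefix: str) -> str:
        # "/user/alice/" and "user/alice" both become "/user/alice".
        return "/" + prefix.strip("/")

    def add_route(self, prefix: str, target: str) -> None:
        """Register a route, as the hub would when a user server starts."""
        self._routes[self._normalize(prefix)] = target

    def remove_route(self, prefix: str) -> None:
        """Drop a route, as the hub would when a user server stops."""
        prefix = self._normalize(prefix)
        if prefix == "/":
            raise ValueError("the default route cannot be removed")
        self._routes.pop(prefix, None)

    def resolve(self, path: str) -> str:
        """Return the target of the longest prefix matching the path."""
        best = "/"
        for prefix in self._routes:
            if prefix == "/":
                continue
            # Match only on a path-segment boundary, so that
            # "/user/alice" does not capture "/user/alice-2/...".
            if path == prefix or path.startswith(prefix + "/"):
                if len(prefix) > len(best):
                    best = prefix
        return self._routes[best]


# Example addresses are assumptions, not values taken from the chart.
table = RoutingTable(default_target="http://hub:8081")
table.add_route("/user/my-username", "http://10.0.0.12:8888")

hub_target = table.resolve("/hub/login")
user_target = table.resolve("/user/my-username/lab")
```

With these example routes, a request to /hub/login falls through to the default route and reaches the hub, while /user/my-username/lab goes to the user server pod. Removing the user route again sends that path back to the hub.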

consideRatio avatar Oct 05 '20 07:10 consideRatio

This is quite detailed and clears up the architecture for me! Thanks @consideRatio. Will open a draft PR describing the changes soon. I will work on creating this as a block diagram and also add a security section, but I will probably wait for #1798 to be merged. This change is really exciting!

Overall I think the following items need to be added:

  • Create an architecture diagram of Z2JH
    • Provide background of JupyterHub components
    • Explain the k8s components that facilitate JH in k8s
    • Add pointers for infra specific tasks
  • Add a Security section
    • Explain the general best practices implemented in z2jh
    • Document how to override or change these parameters with appropriate warnings

Finally, this is more of a UX change, where I think it makes sense to switch the search bar and header positions, and make the topics easy to navigate with collapsible menus to enable quick movement in the docs.

Please feel free to add additional points that you would like for me to cover!

nachiket-lab avatar Oct 05 '20 09:10 nachiket-lab

@NerdSec I love it! :D :tada:

There is a security section already nested under the administrators guide.

consideRatio avatar Oct 05 '20 10:10 consideRatio

@consideRatio So, I have managed to add some form of structure to the docs. I have grouped everything under four sections, and will now work on them individually.

I have fixed the links and added relative paths for images and references to ensure things don't break in the future. A version of the docs is hosted here.

Repo: https://github.com/NerdSec/z2jh-docs

nachiket-lab avatar Oct 07 '20 06:10 nachiket-lab