spin-operator icon indicating copy to clipboard operation
spin-operator copied to clipboard

SpinApp: Support nodeSelector config

Open vdice opened this issue 1 year ago • 9 comments

Feature request to support sending in user-defined nodeSelector config to the pod(s) representing a given SpinApp.

vdice avatar Apr 24 '24 22:04 vdice

We already somewhat support that (and require it in most production deployments) via RuntimeClasses - I'd love to hear more about where you envision wanting those and why on an app/executor level?

endocrimes avatar Apr 24 '24 23:04 endocrimes

Selection via RuntimeClass is probably sufficient for my general use case, which is having control over which subset of nodes in a cluster SpinApps can run on. I don't currently need a finer-grained scheduling strategy than that, so unless others chime in, we can use the RuntimeClass approach for now.

vdice avatar Apr 24 '24 23:04 vdice

I think we are still missing the ability to ensure Spin pods are only scheduled to nodes that were provisioned by the runtime class manager. For example, in my setup, i have a 3 node cluster. I've only annotated the one named aks-apps-59045239-vmss000000 (kubectl annotate node --all runtime=containerd-shim-spin), so only that one is provisioned with the Spin shim.

However, spin apps were still scheduled there and left in ContainerCreating state:

density-0-4-7fc4dd8f78-862j6         1/1     Running             0          4m51s   10.10.2.183   aks-apps-59045239-vmss000000     <none>           <none>
density-0-5-6f8c9f9855-mqkqr         1/1     Running             0          4m51s   10.10.2.128   aks-apps-59045239-vmss000000     <none>           <none>
density-0-6-7fcb448876-7l5f2         1/1     Running             0          4m51s   10.10.2.170   aks-apps-59045239-vmss000000     <none>           <none>
density-0-7-5f6775cc4f-cdvv8         0/1     ContainerCreating   0          4m51s   <none>        aks-system-15560169-vmss000000   <none>           <none>
density-0-8-54676f889d-pxh8s         0/1     ContainerCreating   0          4m51s   <none>        aks-system-15560169-vmss000000   <none>           <none>
density-0-9-698557c7d8-l64sn         1/1     Running             0          4m51s   10.10.2.214   aks-apps-59045239-vmss000000     <none>           <none>

Which is due to the shim not existing on that node:

Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                  From               Message
  ----     ------                  ----                 ----               -------
  Normal   Scheduled               4m39s                default-scheduler  Successfully assigned default/density-0-7-5f6775cc4f-cdvv8 to aks-system-15560169-vmss000000
  Warning  FailedCreatePodSandBox  2s (x22 over 4m39s)  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox runtime: no runtime for "spin" is configured

kate-goldenring avatar Apr 26 '24 21:04 kate-goldenring

It looks like we need to make some more considerations with scheduling with RuntimeClass: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/585-runtime-class/README.md#runtimeclass-scheduling

kate-goldenring avatar Apr 26 '24 21:04 kate-goldenring

I just noticed the call out on this in the SpinKube installation docs. Thanks @endocrimes

kate-goldenring avatar Apr 26 '24 21:04 kate-goldenring

Assuming you provisioned the shim with kwasm/runtime-class-manager, the following works well:

  1. Label the provisioned nodes:
kubectl annotate node -l runtime=containerd-shim-spin kwasm.sh/kwasm-node=true
  1. Apply the runtime class with the nodeSelector for the label:
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: wasmtime-spin-v2
handler: spin
scheduling:
  nodeSelector:
    runtime: containerd-shim-spin

kate-goldenring avatar Apr 26 '24 21:04 kate-goldenring

Doing this via the RuntimeClass feels a lot like an anti pattern.

Its basically saying the user would be editing the RuntimeClass to set this. For anything else they are running this wont be the case.

Setting via the SpinApp CRD makes a lot more sense.

chokosabe avatar Aug 28 '24 20:08 chokosabe

@chokosabe We already have to have a runtime class in SpinKube (for the CRI to correctly detect that a shim should be used) - so we also rely on the runtimeClass expressing that "these nodes meet this class" (I've done the same with gvisor in the past, but it's as-designed).

It feels a bit awkward the first time you come across it (esp if you haven't really used RuntimeClasses for things that don't come with your k8s distribution / or on bare metal) but it has a nice benefit of not having to duplicate cluster-specific scheduling constraints across apps for portability, when already saying "i have this dependency".

In the future we should probably handle a lot of that for you in the runtime-class-manager (which would be super cool for also being able to say "at least version X of the shim" at the same time 😅 ).


More generally we probably do want to also support more fine-grained scheduling options in the operator in the future tho (e.g if the shim gets support for GPUs), but I was hesitant to add it to cover things that should be handled with the RuntimeClass

endocrimes avatar Aug 28 '24 21:08 endocrimes

I think it is worth considering adding node selectors to the SpinApp. I think we could see them as additive, such that node selectors could be added to the runtime class and the SpinApp and we'd define some precedence

kate-goldenring avatar Aug 28 '24 22:08 kate-goldenring