
Add node selector to KubernetesScheduler run opts

JackWittmayer opened this issue 6 months ago

Description

Allow TorchX KubernetesScheduler users to specify a node selector that controls which nodes their Volcano job pods are scheduled onto.

Motivation/Background

Currently, users can only target machines based on resources or the node.kubernetes.io/instance-type label. A node selector would let them direct jobs to specific machines in any way they want, which enables use cases such as testing isolated machines, running consecutive jobs on the same machine for comparison, and segmenting the Kubernetes cluster by label.

Detailed Proposal

Add node_selector as a run opt to KubernetesScheduler.run_opts, KubernetesOpts, and the other entry points, and apply the user-specified node_selector in the role_to_pod method; see the sketch below.
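A minimal sketch of the idea, assuming role_to_pod returns a kubernetes.client V1Pod and the run opt is parsed into a plain dict (the helper apply_node_selector is illustrative, not existing TorchX code):

from typing import Dict, Optional

from kubernetes.client.models import V1Pod


def apply_node_selector(pod: V1Pod, node_selector: Optional[Dict[str, str]]) -> V1Pod:
    # V1PodSpec exposes node_selector as a plain dict of label -> value.
    if node_selector:
        pod.spec.node_selector = dict(node_selector)
    return pod


# Example: pin the job to a labeled node pool.
# pod = role_to_pod(name, role, service_account)   # existing TorchX helper
# pod = apply_node_selector(pod, {"pool": "a100-spot"})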

Alternatives

Extend the resource.capabilities feature to include other labels. This solution is less desirable because hard-coded label names will always be limiting.

Additional context/links

Code linked above. Documentation: https://docs.pytorch.org/torchx/main/schedulers/kubernetes.html

JackWittmayer avatar May 19 '25 16:05 JackWittmayer

Do you have any objections, @kiukchung, @d4l3k, @tonykao8080, @andywag?

clumsy avatar May 21 '25 14:05 clumsy

Just want to check if there are any concerns - otherwise I can contribute @kiukchung , @d4l3k, @tonykao8080, @andywag

clumsy avatar Oct 15 '25 18:10 clumsy

Given the related issue https://github.com/meta-pytorch/torchx/issues/1068, it might make more sense to allow pod specs given as YAML or JSON to be "overlaid" on top of the specs that kubernetes_scheduler.py generates from the appdef.

That way we don't have to keep adding run opts, and users can unblock themselves when they need additional custom configs.

Fwiw, we do this for our internal scheduler.

Use the role's metadata for this:

role.metadata["kubernetes"]["PodSpec"]={
  "nodeSelector": ...
}
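
A hedged sketch of how such an overlay could be applied (deep_merge and the metadata keys are illustrative, not current TorchX behavior):

from typing import Any, Dict


def deep_merge(base: Dict[str, Any], overlay: Dict[str, Any]) -> Dict[str, Any]:
    # Recursively merge overlay into base; overlay values win on conflicts.
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Illustrative usage: serialize the generated pod, overlay the user's
# PodSpec fragment from role.metadata, then submit the merged dict.
# pod_dict = api_client.sanitize_for_serialization(role_to_pod(name, role, sa))
# overlay = role.metadata.get("kubernetes", {}).get("PodSpec", {})
# pod_dict["spec"] = deep_merge(pod_dict.get("spec", {}), overlay)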

kiukchung avatar Oct 15 '25 20:10 kiukchung

I think adding a generic option here sounds fine as an escape hatch -- I'm also fine with adding it as a run opt for common use cases.

d4l3k avatar Oct 15 '25 23:10 d4l3k

Thanks @kiukchung and @d4l3k! Let me process this. I don't want us to add run opts for each and every option k8s has, though; I think a generic solution would be more flexible and maintainable (against possible changes/customization in the k8s spec).

clumsy avatar Oct 16 '25 17:10 clumsy