feat: Add per-plugin nodeSelector support
Adds a nodeSelector field to HighNodeUtilizationArgs and LowNodeUtilizationArgs to allow filtering nodes per plugin.
This enables different plugins in the same profile to target different sets of nodes based on labels, solving the limitation where only a global nodeSelector was available.
Use case: Run HighNodeUtilization only on nodes blocked by cluster-autoscaler while other plugins run on all nodes.
Changes:
- Add NodeSelector field to both plugin Args structs
- Implement node filtering logic in Balance() methods
- Add nodeMatchesSelector helper function (see the sketch below)
- Generate deepcopy code for new fields
- Add comprehensive tests for nodeSelector logic
Addresses: #1486
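As a rough illustration only, the helper mentioned above could look something like the sketch below; the exact field type and signature in the actual diff may differ (plain map[string]string label matching is assumed here).

package nodeutilization

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/labels"
)

// nodeMatchesSelector reports whether a node's labels satisfy the given
// selector map. An empty selector matches every node, so plugins that do not
// set nodeSelector keep their current behavior.
func nodeMatchesSelector(node *v1.Node, nodeSelector map[string]string) bool {
	if len(nodeSelector) == 0 {
		return true
	}
	return labels.SelectorFromSet(labels.Set(nodeSelector)).Matches(labels.Set(node.Labels))
}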
Description
Checklist
Please ensure your pull request meets the following criteria before submitting it for review; these items will be used by reviewers to assess the quality and completeness of your changes:
- [ ] Code Readability: Is the code easy to understand, well-structured, and consistent with project conventions?
- [ ] Naming Conventions: Are variable, function, and struct names descriptive and consistent?
- [ ] Code Duplication: Is there any repeated code that should be refactored?
- [ ] Function/Method Size: Are functions/methods short and focused on a single task?
- [ ] Comments & Documentation: Are comments clear, useful, and not excessive? Were comments updated where necessary?
- [ ] Error Handling: Are errors handled appropriately?
- [ ] Testing: Are there sufficient unit/integration tests?
- [ ] Performance: Are there any obvious performance issues or unnecessary computations?
- [ ] Dependencies: Are new dependencies justified?
- [ ] Logging & Monitoring: Is logging used appropriately (not too verbose, not too silent)?
- [ ] Backward Compatibility: Does this change break any existing functionality or APIs?
- [ ] Resource Management: Are resources (files, connections, memory) managed and released properly?
- [ ] PR Description: Is the PR description clear, providing enough context and explaining the motivation for the change?
- [ ] Documentation & Changelog: Are README and docs updated if necessary?
- :white_check_mark: login: SarCode / name: Sarthak Agarwal (3a24013acb21055a3dc09f2d168cab8411400078, 5fb59119ce2d7149a66b1f4d0e9e9f36533368ba, fc9d6ed919c3483cc877c6ba0cb16339f1dfac85)
The committers listed above are authorized under a signed CLA.
Welcome @SarCode!
It looks like this is your first PR to kubernetes-sigs/descheduler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.
You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.
You can also check if kubernetes-sigs/descheduler has its own contribution guidelines.
You may want to refer to our testing guide if you run into trouble with your tests not passing.
If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!
Thank you, and welcome to Kubernetes. :smiley:
Hi @SarCode. Thanks for your PR.
I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
I understand the commands that are listed here.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign ingvagabund for approval. For more information see the Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
/ok-to-test
@SarCode: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.
In response to this:
/ok-to-test
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Apologies if I am being too eager, but is there a way I could get a review on this? It's my first open source contribution, so I'm really excited about this 😁
@googs1025 or @ingvagabund maybe?
/ok-to-test
/retest
/retest
Formatting issues fixed 😃
@googs1025, the tests went through 👌
@SarCode thank you for your first contribution here and welcome to the open source community.
The current approach is less practical since only two plugins will benefit from it. Also, each plugin will need to specify the node selector, which is more suitable as a profile-level field. Saying that, I recommend taking the more generic approach and defining a node selector field at the profile level.
Thank you for the kind words @ingvagabund !
I was thinking of the same thing, but I thought it would be easier if I contained it to just the Utilization use case. I will make some changes and make it more generic 👌
What if we add an optional profile-level nodeSelector while keeping the existing plugin-level support:
type DeschedulerProfile struct {
	Name          string            `json:"name"`
	PluginConfigs []PluginConfig    `json:"pluginConfig"`
	Plugins       Plugins           `json:"plugins"`
	NodeSelector  map[string]string `json:"nodeSelector,omitempty"` // NEW
}
It will look something like this:
profiles:
  - name: default
    nodeSelector: # Applies to ALL plugins in the profile
      node-type: worker
    pluginConfig:
      - name: LowNodeUtilization
        args: {} # No plugin-level nodeSelector needed
NodeSelector needs to be a string to be consistent with the current globally scoped node selector.
What if we add an optional profile-level nodeSelector while keeping the existing plugin-level support
Would there be any use for the plugin-level node selector?
Thank you for your contribution, since I barely have time to fix it myself... I will review it as soon as possible, too.
What if we add an optional profile-level nodeSelector while keeping the existing plugin-level support
Would there be any use for the plugin-level node selector?
Agree. Just a profile-level node selector seems to be ok. Because from my perspective, if someone wants to configure LowNodeUtilization with "no" nodeSelector while configuring another plugin with "some" nodeSelector, it's better divided into 2 separate profiles.
If I understand this correctly, the descheduler should run with separate profiles if they want different nodeSelector?
apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
profiles:
  # Profile 1: For worker nodes only
  - name: "worker-nodes"
    nodeSelector: "node-type=worker" # Profile-level nodeSelector
    pluginConfig:
      - name: "LowNodeUtilization"
        args:
          thresholds:
            cpu: 20
            memory: 20
          targetThresholds:
            cpu: 50
            memory: 50
    plugins:
      deschedule:
        enabled:
          - "LowNodeUtilization"
  # Profile 2: For master nodes only
  - name: "master-nodes"
    nodeSelector: "node-type=master" # Different nodeSelector
    pluginConfig:
      - name: "HighNodeUtilization"
        args:
          thresholds:
            cpu: 80
            memory: 80
    plugins:
      balance:
        enabled:
          - "HighNodeUtilization"
Does descheduler support this right now? Because if I remember correctly, nodeSelector is global for all profiles.
Looking at the docs it seems like profile right now does have a nodeSelector config
create your own Evictor Plugin or use the Default one provided by Descheduler. Other uses for the Evictor plugin can be to sort, filter, validate or group pods by different criteria, and that's why this is handled by a plugin and not configured in the top level config
https://github.com/kubernetes-sigs/descheduler?tab=readme-ov-file#evictor-plugin-configuration-default-evictor
If I understand this correctly, the descheduler should run with separate profiles if they want different nodeSelector?
apiVersion: "descheduler/v1alpha2" kind: "DeschedulerPolicy" profiles: # Profile 1: For worker nodes only - name: "worker-nodes" nodeSelector: "node-type=worker" # Profile-level nodeSelector pluginConfig: - name: "LowNodeUtilization" args: thresholds: cpu: 20 memory: 20 targetThresholds: cpu: 50 memory: 50 plugins: deschedule: enabled: - "LowNodeUtilization" # Profile 2: For master nodes only - name: "master-nodes" nodeSelector: "node-type=master" # Different nodeSelector pluginConfig: - name: "HighNodeUtilization" args: thresholds: cpu: 80 memory: 80 plugins: balance: enabled: - "HighNodeUtilization"Does descheduler supports this right now? cause if I remember correctly nodeSelector is global for all profiles
Yes, your example is just what I want to implement and what I proposed inside #1486. I don't think descheduler supports this right now. That's why we're implementing it here :)
Looking at the docs it seems like profile right now does have a nodeSelector config
If I understand it right, this is a plugin-level nodeSelector only for DefaultEvictor.
I tried the two-profile config with different nodeSelectors and it worked:
apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
profiles:
  # Profile 1: HighNodeUtilization targeting only worker nodes
  - name: high-utilization-workers
    pluginConfig:
      - name: DefaultEvictor
        args:
          nodeSelector: "type=worker" # Only process worker nodes
          evictLocalStoragePods: true
          ignorePvcPods: true
          nodeFit: true
      - name: HighNodeUtilization
        args:
          thresholds:
            cpu: 50
            memory: 50
    plugins:
      balance:
        enabled:
          - HighNodeUtilization
  # Profile 2: Other plugins targeting all non-control-plane nodes
  - name: general-policies
    pluginConfig:
      - name: DefaultEvictor
        args:
          nodeSelector: "!node-role.kubernetes.io/control-plane" # All nodes except control-plane
          evictLocalStoragePods: true
          ignorePvcPods: true
          nodeFit: true
      - name: RemovePodsViolatingNodeAffinity
        args:
          nodeAffinityType:
            - requiredDuringSchedulingIgnoredDuringExecution
      - name: RemovePodsViolatingNodeTaints
        args:
          includePreferNoSchedule: true
      - name: RemovePodsViolatingInterPodAntiAffinity
      - name: RemovePodsViolatingTopologySpreadConstraint
    plugins:
      balance:
        enabled:
          - RemovePodsViolatingTopologySpreadConstraint
      deschedule:
        enabled:
          - RemovePodsViolatingNodeTaints
          - RemovePodsViolatingNodeAffinity
          - RemovePodsViolatingInterPodAntiAffinity
The logs confirmed:
- Profile 1 correctly processed only worker nodes for HighNodeUtilization.
- Profile 2 correctly processed all non-control-plane nodes for the other plugins.
@SarCode Thanks for your verification! But just as I described, this use case is trying to configure DefaultEvictor in 2 profiles. The nodeSelector is DefaultEvictor's argument. It's a tricky way to meet the initial requirement, because right now only DefaultEvictor implements the EvictorPlugin interface. If someday we have a "FooEvictor" that implements EvictorPlugin, then the "FooEvictor" must support nodeSelector as well.
So what I hope to do is add nodeSelector to DeschedulerProfile, maybe like this:
type DeschedulerProfile struct {
	Name          string
	PluginConfigs []PluginConfig
	Plugins       Plugins
	NodeSelector  *metav1.LabelSelector // or *string
}
This prevents every plugin from needing to support nodeSelector one by one.
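Just to sketch the idea (none of this is in the PR; the helper name and wiring below are hypothetical): the profile-level selector could be resolved once when the profile builds its node list, so plugins never have to deal with it themselves. Whether the field is *metav1.LabelSelector or *string only changes how the selector is parsed (labels.Parse for the string form).

package descheduler

import (
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
)

// filterProfileNodes is a hypothetical helper: it applies a profile-level
// label selector once so that every plugin in the profile only sees the
// matching nodes. A nil selector keeps the full node list.
func filterProfileNodes(nodes []*v1.Node, nodeSelector *metav1.LabelSelector) ([]*v1.Node, error) {
	if nodeSelector == nil {
		return nodes, nil
	}
	selector, err := metav1.LabelSelectorAsSelector(nodeSelector)
	if err != nil {
		return nil, err
	}
	filtered := make([]*v1.Node, 0, len(nodes))
	for _, node := range nodes {
		if selector.Matches(labels.Set(node.Labels)) {
			filtered = append(filtered, node)
		}
	}
	return filtered, nil
}

Either way the check happens once per profile, not once per plugin.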
My intention behind all this is to run HighNodeUtilization only on nodes which can't be removed by the cluster-autoscaler, maybe due to local storage or similar.
Meanwhile, all other plugins can run on all nodes.
A nodeSelector at the profile level can skip many nodes, whereas if it is used through the DefaultEvictor, all nodes still get processed, which will not be that effective in big clusters.