
Support identifying network topology from node labels and converting it into HyperNode resources

Open Lily922 opened this issue 1 year ago • 5 comments

Which issue(s) this PR fixes:

https://github.com/volcano-sh/volcano/issues/4145

What type of PR is this?

Add a controller to manage the automatic creation of hypernodes.

The workflow of this controller can be summarized in the following steps:

  1. Configuration Loading: The controller loads its configuration from a ConfigMap named volcano-networktopologyaware-configuration, which contains the key-value pairs describing the node labels to be recognized.

  2. Periodic Sync and Hypernode Resource Check: At regular intervals (1s by default), the controller checks the current state of the Hypernode resources against the state expected from the node labels and the ConfigMap.

  3. Automatic Hypernode Creation or Adjustment: If the Hypernodes do not match the expected configuration (for example, a required Hypernode is missing or misconfigured), the controller automatically creates, deletes, or adjusts Hypernode resources to bring them back to the desired state.
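
The sync step above can be pictured as a diff between the desired and the actual set of hypernodes. The following is a minimal Python sketch of that idea, not the controller's actual Go code; `HyperNodeSpec` and `plan_sync` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperNodeSpec:
    """Hypothetical, simplified stand-in for a HyperNode resource."""
    name: str
    tier: int
    members: tuple  # names of member nodes or child hypernodes


def plan_sync(desired, actual):
    """Compare desired and actual hypernodes (dicts keyed by name).

    Returns (to_create, to_update, to_delete) as sorted lists of names.
    """
    to_create = sorted(name for name in desired if name not in actual)
    to_delete = sorted(name for name in actual if name not in desired)
    to_update = sorted(
        name for name in desired
        if name in actual and desired[name] != actual[name]
    )
    return to_create, to_update, to_delete


desired = {
    "t1-s0": HyperNodeSpec("t1-s0", 1, ("node-a", "node-b")),
    "t2-s4": HyperNodeSpec("t2-s4", 2, ("t1-s0",)),
}
actual = {
    "t1-s0": HyperNodeSpec("t1-s0", 1, ("node-a",)),  # stale member list
    "t1-s9": HyperNodeSpec("t1-s9", 1, ("node-z",)),  # no longer expected
}
print(plan_sync(desired, actual))
```

Running the diff on every sync period keeps the step idempotent: when nothing has changed, all three lists are empty and no API calls are needed.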

Example: Using the Controller for Automatic Hypernode Creation:

  1. To configure which node label keys the controller should recognize, set the node_label_based_topologies_override value in the chart's values.yaml file. The keys must always follow a strict hierarchy, listed from the lowest layer to the highest layer. For example:
- name: topology1
  topo-level-keys:
    - network-tier1
    - network-tier2
    - network-tier3

This configuration tells the controller to look for nodes with the label keys network-tier1, network-tier2, and network-tier3.

Warning: Ensure the order of the keys is correct. Incorrect ordering of the keys will generate a topology that does not match your expectations.

  2. Assume you have k8s nodes with the following labels:
labels:
  network-tier1: s0
  network-tier2: s4
  network-tier3: s6
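
To see why the key order matters, here is a small Python sketch that maps a node's labels to its position in the hierarchy. `topology_path` is a hypothetical helper, and stopping at the first missing key is an assumption of this sketch rather than confirmed controller behavior.

```python
def topology_path(labels, level_keys):
    """Return [(tier, value), ...] for one node, lowest tier first.

    Stops at the first missing key, since a gap would break the hierarchy
    (an assumption of this sketch, not confirmed controller behavior).
    """
    path = []
    for tier, key in enumerate(level_keys, start=1):
        value = labels.get(key)
        if value is None:
            break
        path.append((tier, value))
    return path


labels = {"network-tier1": "s0", "network-tier2": "s4", "network-tier3": "s6"}
keys = ["network-tier1", "network-tier2", "network-tier3"]

print(topology_path(labels, keys))                  # s0 at tier 1, s6 at tier 3
print(topology_path(labels, list(reversed(keys))))  # reversed keys invert the tree
```

With the keys reversed, s6 is treated as the lowest tier and s0 as the highest, which is the kind of unexpected topology the warning above refers to.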

Nodes with such labels form a hypernode network with 3 tiers:

[Image: hypernode network with 3 tiers]

  3. Volcano will automatically create the hypernode resources topology1-t1-s0-xxxx, topology1-t1-s1-xxxx, topology1-t1-s2-xxxx, topology1-t1-s3-xxxx, topology1-t2-s4-xxxx, topology1-t2-s5-xxxx, and topology1-t3-s6-xxxx. Example (hypernode7):
apiVersion: topology.volcano.sh/v1alpha1
kind: HyperNode
metadata:
  labels:
    volcano.io/label-based-topology-name: topology1
  name: topology1-t3-s6-xxxx
spec:
  members:
  - selector:
      exactMatch:
        name: topology1-t2-s4-xxxx
    type: HyperNode
  - selector:
      exactMatch:
        name: topology1-t2-s5-xxxx
    type: HyperNode
  tier: 3
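
The conversion from labeled nodes to manifests like the one above can be sketched as follows. This is an illustrative Python sketch, not the controller's implementation: `build_hypernodes` is a hypothetical function, the generated `-xxxx` name suffix is left out, and listing tier-1 members as `Node` entries selected by `exactMatch` is an assumption.

```python
from collections import defaultdict


def build_hypernodes(topology, level_keys, nodes):
    """Build HyperNode manifests from {node_name: labels}.

    Assumptions of this sketch: names are "<topology>-t<tier>-<value>"
    without the generated suffix, and tier-1 hypernodes list their nodes
    as exactMatch members of type Node.
    """
    members = defaultdict(set)  # hypernode name -> {(member type, member name)}
    tiers = {}
    for node_name, labels in nodes.items():
        child = ("Node", node_name)
        for tier, key in enumerate(level_keys, start=1):
            value = labels.get(key)
            if value is None:
                break  # incomplete label chain: stop climbing
            name = f"{topology}-t{tier}-{value}"
            members[name].add(child)
            tiers[name] = tier
            child = ("HyperNode", name)  # this hypernode is the next tier's member
    return {
        name: {
            "apiVersion": "topology.volcano.sh/v1alpha1",
            "kind": "HyperNode",
            "metadata": {
                "name": name,
                "labels": {"volcano.io/label-based-topology-name": topology},
            },
            "spec": {
                "tier": tiers[name],
                "members": [
                    {"type": kind, "selector": {"exactMatch": {"name": member}}}
                    for kind, member in sorted(members[name])
                ],
            },
        }
        for name in sorted(members)
    }


keys = ["network-tier1", "network-tier2", "network-tier3"]
nodes = {  # hypothetical node names
    "node-a": {"network-tier1": "s0", "network-tier2": "s4", "network-tier3": "s6"},
    "node-b": {"network-tier1": "s2", "network-tier2": "s5", "network-tier3": "s6"},
}
hypernodes = build_hypernodes("topology1", keys, nodes)
for name, manifest in hypernodes.items():
    print(name, manifest["spec"]["tier"])
```

For these two nodes, the tier-3 entry ends up with the two tier-2 hypernodes as members, mirroring the structure of the example manifest.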

Lily922, Mar 26 '25 07:03

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

To complete the pull request process, please assign shinytang6. You can assign the PR to them by writing /assign @shinytang6 in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

volcano-sh-bot, Mar 26 '25 07:03

Hi, please add more details :)

Monokaix, Mar 26 '25 08:03

Need to add a design doc

JesseStutler, Mar 28 '25 03:03

@Lily922: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

volcano-sh-bot, May 02 '25 01:05

The "auto discovery" semantics don't seem quite right; "auto conversion" seems like a better fit.

Monokaix, May 12 '25 01:05