[Core] Dynamic Node Labeling and Resource Registration
Description
Right now, Ray provides node labeling and resource registration when calling `ray start --label/--custom-resource` or when using the RAY_OVERRIDE_RESOURCES/RAY_OVERRIDE_LABELS env vars. As the lifecycle of a Ray Cluster becomes longer, we hope to be able to label nodes dynamically and then apply nodeLabelSelector to dispatch actors or tasks.
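To make the request concrete, here is a minimal sketch in plain Python (it does not call Ray) of how a label selector would pick nodes, and how relabeling a running node changes the result. The `matching_nodes` helper, the node IDs, and the `pool` label are hypothetical, and the exact-match semantics are an assumption; Ray's actual selector may behave differently.

```python
from typing import Dict, List


def matching_nodes(
    nodes: Dict[str, Dict[str, str]], selector: Dict[str, str]
) -> List[str]:
    """Return the sorted IDs of nodes whose labels contain every selector pair.

    Hypothetical helper: exact key/value matching only.
    """
    return sorted(
        node_id
        for node_id, labels in nodes.items()
        if all(labels.get(key) == value for key, value in selector.items())
    )


# Labels as they might be set once at node startup (illustrative values).
nodes = {
    "node-a": {"pool": "cpu"},
    "node-b": {"pool": "gpu"},
}
selector = {"pool": "gpu"}

before = matching_nodes(nodes, selector)

# The requested feature: relabel a live node without restarting it.
nodes["node-a"]["pool"] = "gpu"

after = matching_nodes(nodes, selector)
```

With static labels only `node-b` matches; after the relabel, both nodes are eligible for the same selector.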
Use case
No response
What's your use case for dynamic labels? I think today you can create a new node with new labels, and shut down the old node.
Yeah, if the workload is submitted using RayJob with an ephemeral Ray Cluster, we can add the custom labels at startup. But if we have a long-running Ray Cluster, that won't work.
If we want to implement this, could it be achieved by using RaySyncer? The syncer would trigger the NodeManager, which does ConsumeSyncMessage -> UpdateResourceUsage -> UpdateNode -> ClusterResourceManager.AddOrUpdateNode, and eventually updates the NodeResources data structure (which contains the labels).
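The call chain above can be sketched as a toy Python model. This is not Ray's C++ code: the method names mirror the ones mentioned in the comment, but the `SyncMessage` shape, which class owns which method, and the merge-on-update behavior are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class NodeResources:
    """Toy stand-in for the structure that holds resources and labels."""
    total: Dict[str, float] = field(default_factory=dict)
    labels: Dict[str, str] = field(default_factory=dict)


@dataclass
class SyncMessage:
    """Hypothetical sync payload carrying a node's new labels."""
    node_id: str
    labels: Dict[str, str]


class ClusterResourceManager:
    def __init__(self) -> None:
        self.nodes: Dict[str, NodeResources] = {}

    def add_or_update_node(self, node_id: str, resources: NodeResources) -> None:
        # Last step of the chain: overwrite the stored view of the node.
        self.nodes[node_id] = resources


class NodeManager:
    def __init__(self, crm: ClusterResourceManager) -> None:
        self.crm = crm

    def consume_sync_message(self, msg: SyncMessage) -> None:
        self.update_resource_usage(msg)

    def update_resource_usage(self, msg: SyncMessage) -> None:
        self.update_node(msg.node_id, msg.labels)

    def update_node(self, node_id: str, labels: Dict[str, str]) -> None:
        # Assumption: new labels are merged over existing ones and
        # resource totals are left unchanged.
        current = self.crm.nodes.get(node_id, NodeResources())
        updated = NodeResources(
            total=dict(current.total),
            labels={**current.labels, **labels},
        )
        self.crm.add_or_update_node(node_id, updated)


crm = ClusterResourceManager()
crm.add_or_update_node(
    "node-a", NodeResources(total={"CPU": 4.0}, labels={"pool": "cpu"})
)
manager = NodeManager(crm)
manager.consume_sync_message(SyncMessage("node-a", {"pool": "gpu", "zone": "z1"}))
```

After the message is consumed, `crm.nodes["node-a"].labels` holds the merged labels while the CPU total is untouched, which is the end state the proposal relies on for scheduling against updated labels.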