Kai-Hsun Chen
cc @angelinalg or @can-anyscale would you mind merging this PR? Thanks!
cc @andrewsykim @rueian would you mind taking a look?
Hey folks, I will address comments by updating the Google doc. I will sync this PR with the Google doc periodically.
The error seems to be:

```
ray.exceptions.RaySystemError: System error: Failed to create placement group '37f2cda36162ae36e11df8a30a0901000000' because name 'global_poolverl_group_4:0' already exists.
```

instead of OOM.
@eric-haibin-lin would you mind adding a label `ray` so that I can track the progress?
Offline discussion: `num_matching_resource_types` is useless. @rueian will open a follow-up PR to remove it.
Offline discussion:

# Issue statement

* If a node type has running instances, the instance's `memory` and `object_store_memory` will be added to `node_types[node_type]["resources"]`, unless the user has explicitly specified them...
KubeRay v1.1.0 has already solved this issue.
After discussing this offline, we decided not to support the removal of a worker group from RayCluster for the following reasons:

* The implementation is complex.
* There have been...
@HarryCaveMan is this related to https://github.com/ray-project/ray/pull/48924?