rayfed

What's the meaning of "cluster_config"?

Open NKcqx opened this issue 2 years ago • 4 comments

Now that we have separated `ray.init` from `fed.init`, there's no way to reach cluster-level information, since each `fed.init` call starts a job session and nothing more.

The exception would be a global actor (or service job) that can break job isolation and filter out invalid parameter types in each job's tasks.

Originally posted by @NKcqx in https://github.com/ray-project/rayfed/pull/140#discussion_r1263583091

NKcqx, Jul 18 '23 03:07

If we don't have the cluster config, why not just use `config`? The question we should answer before this is finalized is whether we will need a cluster config at a high level in the future.

Originally posted by @jovany-wang in https://github.com/ray-project/rayfed/pull/140#discussion_r1263675724

NKcqx, Jul 18 '23 03:07

Firstly, users can't configure the Ray cluster in a RayFed job, since initialization of the Ray cluster has been separated from RayFed initialization, i.e. `fed.init`. I think the original meaning of "cluster_config" was "configure the cluster used in this job", in which case it's a job-level config that contains all the non-business configuration.

NKcqx, Jul 18 '23 03:07

> I think the original meaning of "cluster_config" was "configure the cluster used in this job", in which case it's a job-level config that contains all the non-business configuration.

That sounds reasonable. Let's use `config` as the parameter name for the job-level configuration.
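
To make the "job-level config" idea concrete, here is a minimal sketch in plain Python. It does not use RayFed or Ray: `init_job`, `JobConfig`, and the keys in `ALLOWED_JOB_CONFIG_KEYS` are hypothetical names chosen for illustration only. The assumption is that a per-job init call accepts only non-business, job-scoped settings and rejects anything that looks like cluster setup.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Hypothetical job-scoped keys; these are illustrative and are NOT taken
# from the RayFed code base.
ALLOWED_JOB_CONFIG_KEYS = {"timeout_s", "retry_count"}


@dataclass
class JobConfig:
    """Non-business settings that apply to one job session only."""
    party: str
    options: Dict[str, Any] = field(default_factory=dict)


def init_job(party: str, config: Dict[str, Any]) -> JobConfig:
    """Hypothetical analogue of a per-job init call.

    It validates the job-level `config` and never touches cluster setup,
    which is assumed to have happened elsewhere.
    """
    unknown = set(config) - ALLOWED_JOB_CONFIG_KEYS
    if unknown:
        raise ValueError(f"unknown job-level config keys: {sorted(unknown)}")
    # Copy the dict so later changes by the caller do not leak into the job.
    return JobConfig(party=party, options=dict(config))
```

With this shape, `init_job("alice", {"timeout_s": 30})` is accepted, while a cluster-style key such as `num_cpus` raises `ValueError`, which mirrors the point that cluster settings are out of reach from inside a job.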

jovany-wang, Jul 18 '23 03:07

What's the status of this PR?

jovany-wang, Sep 14 '23 09:09