[feature] Can we use one headless service for one job?
A single TFJob has ps/worker/chief replicas, and right now we create one headless service per replica. I think we could use a single headless service per job instead, which would be easier to use.
With that change, code could address each replica as {tfjob_name}-{replica_type}-{index}.{service_name}.svc.cluster.local.
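To make the proposed naming concrete, here is a minimal sketch of how an address could be assembled under a single headless service. `ReplicaAddress` and its parameters are hypothetical names for illustration, not part of the operator's API, and lowercasing the replica type is an assumption. The sketch follows the pattern exactly as written above; in a real cluster the namespace normally sits between the service name and `svc`, so it can be passed as part of the suffix.

```go
package main

import (
	"fmt"
	"strings"
)

// ReplicaAddress builds the DNS name of one replica behind a shared headless
// service, following the pattern proposed in this issue:
//   {tfjob_name}-{replica_type}-{index}.{service_name}.{suffix}
// The function name and signature are hypothetical. The suffix is passed in
// so callers can include a namespace, e.g. "default.svc.cluster.local".
func ReplicaAddress(jobName, replicaType string, index int, serviceName, suffix string) string {
	// Lowercasing the replica type is an assumption made for this sketch.
	host := fmt.Sprintf("%s-%s-%d", jobName, strings.ToLower(replicaType), index)
	return fmt.Sprintf("%s.%s.%s", host, serviceName, suffix)
}

func main() {
	// One service ("mnist") covers every replica type of the job.
	for _, rt := range []string{"PS", "Worker", "Chief"} {
		fmt.Println(ReplicaAddress("mnist", rt, 0, "mnist", "svc.cluster.local"))
	}
}
```

The only thing that varies between replicas is the hostname part, so a single service name can be shared by ps, worker, and chief.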
WDYT @johnugeorge @richardsliu
/area engprod /priority p2
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
/reopen
We should take this on to improve cluster performance.
@tenzen-y: Reopened this issue.
I realized the need for this from Aldo's comment.
cc: @kubeflow/wg-training-leads
/lifecycle frozen
@tenzen-y brought this up in brainstorming around jobset/kubeflow.
We have implemented a few ways to customize network names.
```go
type Network struct {
	// EnableDNSHostnames allows pods to be reached via their hostnames.
	// Pods will be reachable using the fully qualified pod hostname:
	// <jobSet.name>-<spec.replicatedJob.name>-<job-index>-<pod-index>.<subdomain>
	// +optional
	EnableDNSHostnames *bool `json:"enableDNSHostnames,omitempty"`

	// Subdomain is an explicit choice for a network subdomain name.
	// When set, any replicated job in the set is added to this network.
	// Defaults to <jobSet.name> if not set.
	// +optional
	Subdomain string `json:"subdomain,omitempty"`
}
```
This is what we used to control service creation for the JobSet.
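As a rough illustration of the hostname format described in the struct comments, the sketch below derives a pod hostname and applies the documented subdomain default. `PodHostname` and the trimmed-down `Network` type are local stand-ins written for this example, not the JobSet API itself, and the sketch deliberately ignores `EnableDNSHostnames`.

```go
package main

import "fmt"

// Network is a local stand-in that keeps only the Subdomain field quoted in
// this thread; it is not imported from the JobSet API.
type Network struct {
	Subdomain string
}

// PodHostname (a hypothetical helper) formats
//   <jobSet.name>-<replicatedJob.name>-<job-index>-<pod-index>.<subdomain>
// and falls back to the JobSet name when no subdomain is set, as the field
// comment describes.
func PodHostname(jobSetName, replicatedJobName string, jobIndex, podIndex int, network Network) string {
	subdomain := network.Subdomain
	if subdomain == "" {
		subdomain = jobSetName
	}
	return fmt.Sprintf("%s-%s-%d-%d.%s", jobSetName, replicatedJobName, jobIndex, podIndex, subdomain)
}

func main() {
	fmt.Println(PodHostname("train", "workers", 0, 1, Network{}))
	fmt.Println(PodHostname("train", "workers", 0, 1, Network{Subdomain: "shared-net"}))
}
```

Because every replicated job in the set joins the same subdomain, one headless service is enough to resolve all of its pods.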
The suffix may differ from .svc.cluster.local depending on the cluster settings. Maybe we could add a CLI parameter to configure it.
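One possible shape for such a parameter, sketched with the standard library `flag` package. The flag name `--cluster-domain`, its default, and the helper names are assumptions made for illustration; no such flag exists in the operator today as far as this thread shows.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseClusterDomain reads a hypothetical --cluster-domain flag from args.
// The default "cluster.local" matches the suffix mentioned in this issue.
func parseClusterDomain(args []string) (string, error) {
	fs := flag.NewFlagSet("operator", flag.ContinueOnError)
	domain := fs.String("cluster-domain", "cluster.local",
		"DNS domain of the cluster, used when building service addresses")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *domain, nil
}

// serviceSuffix returns the suffix appended after the service name.
func serviceSuffix(clusterDomain string) string {
	return "svc." + clusterDomain
}

func main() {
	domain, err := parseClusterDomain(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Println(serviceSuffix(domain))
}
```

Keeping the domain in one place means address construction never hard-codes .svc.cluster.local.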