training-operator icon indicating copy to clipboard operation
training-operator copied to clipboard

refactor(manifests): use named container ports for better maintainability

Open ChughShilpa opened this issue 2 months ago • 6 comments

The current manifests use hardcoded port numbers in probes and service definitions. This makes it harder to maintain and understand the configuration at a glance.

Proposed changes: Add explicit containerPort declarations with named ports and update probes and service to use named ports.

This makes the manifests more self-documenting and easier to maintain

ChughShilpa avatar Dec 30 '25 13:12 ChughShilpa

🎉 Welcome to the Kubeflow Trainer! 🎉

Thanks for opening your first issue! We're happy to have you as part of our community 🚀

Here's what happens next:

  • Our team will review your issue soon! cc @kubeflow/kubeflow-trainer-team
  • If you'd like to contribute to this issue, check out our Contributing Guide for repo-specific guidelines and the Kubeflow Contributor Guide for general community standards.

Join the community:

Feel free to ask questions in the comments if you need any help or clarification! Thanks again for contributing to Kubeflow! 🙏

github-actions[bot] avatar Dec 30 '25 13:12 github-actions[bot]

Thanks for creating this @ChughShilpa, do we really need to have those ports defined in the Deployment manifests? We haven't done it before, and users didn't complain. I feel like users can always check service if they want to verify port number.

andreyvelich avatar Jan 03 '26 21:01 andreyvelich

Naming ports isn't strictly necessary, but it can be quite nice when a container exposes a number of ports - you use a name in readiness/liveness probes and the service. It might be handy with #2905 if we end up adding another port for the trainer status messages.

robert-bell avatar Jan 05 '26 17:01 robert-bell

Hello

I checked the manifests in this repo and found that all containerPort occurrences are currently only present in CRDs under manifests/base/crds/.

I didn’t find any workload manifests with hardcoded container ports to refactor.

Could you please confirm:

  • which manifest files should be updated, or
  • if this issue refers to a different directory or branch?

Happy to work on this once clarified. Thanks!

Goku2099 avatar Jan 15 '26 10:01 Goku2099

Hi @Goku2099,

This issue is referring to the controller deployment and service manifests in manifests/base/manager/manager.yaml (and the equivalent manifests in the helm chart).

There's some extra details on this PR https://github.com/kubeflow/trainer/pull/3056#discussion_r2645309982 which triggered this issue, but essentially the suggestion was to

  1. add the ports to the deployment, as in the linked PR
  2. use the port names in the deployment probes and the service rather than the numbers.

I don't think there's consensus that the change is necessary but any thoughts you have would be very welcome.

robert-bell avatar Jan 16 '26 14:01 robert-bell

hello @robert-bell

Thanks for the clarification, that helps a lot.

I’ll take a look at manifests/base/manager/manager.yaml and the corresponding Helm templates, and follow the approach suggested in PR #3056 — adding named container ports and updating probes and Services to reference those names.

I’ll share a PR once I have something ready. /assign

Goku2099 avatar Jan 16 '26 21:01 Goku2099