Add Initialized and ComponentsCreated conditions to TrainJob API

Open dineshkolhe1 opened this issue 9 months ago • 2 comments

This PR adds Initialized and ComponentsCreated conditions to the TrainJob API to better track the state of job creation and initialization.

  • Added Initialized condition to indicate when the TrainJob has been initialized.
  • Added ComponentsCreated condition to indicate when the components (e.g., JobSet, Jobs) have been created.
  • Updated the controller logic to set these conditions during reconciliation.
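The condition handling described above could be sketched roughly as follows. This is not the PR's code (the diff is not shown in this thread): `Condition` is a simplified local stand-in for the `metav1.Condition` type a real controller would likely use, and `setCondition`, `reconcileConditions`, and the reason and message strings are hypothetical names chosen for illustration. Only the condition types `Initialized` and `ComponentsCreated` come from the description.

```go
package main

// Condition is a simplified stand-in for metav1.Condition (assumption:
// the real TrainJob status would use the apimachinery type instead).
type Condition struct {
	Type    string
	Status  string // "True", "False", or "Unknown"
	Reason  string
	Message string
}

// Condition types named in the PR description.
const (
	TrainJobInitialized       = "Initialized"
	TrainJobComponentsCreated = "ComponentsCreated"
)

// setCondition adds c, or replaces the existing condition of the same Type.
// It reports whether the slice changed, so a caller can skip a needless
// status update. (Hypothetical helper.)
func setCondition(conds *[]Condition, c Condition) bool {
	for i := range *conds {
		if (*conds)[i].Type == c.Type {
			if (*conds)[i] == c {
				return false // identical condition already recorded
			}
			(*conds)[i] = c
			return true
		}
	}
	*conds = append(*conds, c)
	return true
}

// reconcileConditions records both conditions during one reconcile pass.
// componentsCreated says whether the child objects (e.g., JobSet, Jobs)
// exist yet. Reasons and messages are illustrative only.
func reconcileConditions(conds *[]Condition, componentsCreated bool) bool {
	changed := setCondition(conds, Condition{
		Type:    TrainJobInitialized,
		Status:  "True",
		Reason:  "TrainJobInitialized",
		Message: "TrainJob has been initialized",
	})

	status, reason, msg := "False", "ComponentsPending", "Components are not created yet"
	if componentsCreated {
		status, reason, msg = "True", "ComponentsCreated", "Components have been created"
	}
	if setCondition(conds, Condition{
		Type:    TrainJobComponentsCreated,
		Status:  status,
		Reason:  reason,
		Message: msg,
	}) {
		changed = true
	}
	return changed
}
```

Returning a "changed" flag keeps the reconcile loop idempotent: repeating a pass with the same inputs leaves the condition list untouched, so the controller would only need to write status when something actually differs.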

dineshkolhe1, Mar 01 '25 16:03

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please assign tenzen-y for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

google-oss-prow[bot], Mar 01 '25 16:03

@tenzen-y, can you check my PR? I added a feature in it.

dineshkolhe1, Mar 01 '25 17:03

Hi @dineshkolhe1, as we discussed in this issue, we would like to remove the Created condition for now and keep only Complete and Failed, so that we don't introduce breaking changes in the future: https://github.com/kubeflow/trainer/issues/2459

Could you please help us with that?

cc @tenzen-y @Electronic-Waste @astefanutti

andreyvelich, Apr 23 '25 02:04

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot], Jul 22 '25 05:07

This pull request has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

github-actions[bot], Aug 11 '25 05:08