Mutable scheduling directives for suspended Jobs

Open ahg-g opened this issue 4 years ago • 29 comments

Enhancement Description

  • One-line enhancement description (can be used as a release note): make node affinity mutable for jobs
  • Kubernetes Enhancement Proposal:
  • Discussion Link: https://github.com/kubernetes/kubernetes/issues/104714
  • Primary contact (assignee): @ahg-g
  • Responsible SIGs: scheduling
  • Enhancement target (which target corresponds to which milestone):
    • Beta release target (x.y): 1.23
    • Stable release target (x.y):
  • [x] Beta
    • [x] KEP (k/enhancements) update PR(s): https://github.com/kubernetes/enhancements/pull/2931
    • [x] Code (k/k) update PR(s): https://github.com/kubernetes/kubernetes/pull/105479
    • [x] Docs (k/website) update PR(s): https://github.com/kubernetes/website/pull/30390
  • [ ] Stable
    • [ ] KEP (k/enhancements) update PR(s): https://github.com/kubernetes/enhancements/pull/3709
    • [ ] Code (k/k) update PR(s):
    • [ ] Docs (k/website) update(s):

Please keep this description up to date. This will help the Enhancement Team to track the evolution of the enhancement efficiently.

ahg-g avatar Sep 01 '21 19:09 ahg-g

/sig scheduling
/sig apps

ahg-g avatar Sep 01 '21 19:09 ahg-g

/milestone v1.23

salaxander avatar Sep 02 '21 17:09 salaxander

Hi @ahg-g! 1.23 Enhancements team here. Just checking in as we approach enhancements freeze on Thursday 09/09. Here's where this enhancement currently stands:

  • [ ] KEP file using the latest template has been merged into the k/enhancements repo.
  • [X] KEP status is marked as implementable
  • [X] KEP has a test plan section filled out.
  • [X] KEP has up to date graduation criteria.
  • [ ] KEP has a production readiness review that has been completed and merged into k/enhancements.

Looks like for this one we would just need the KEP & the PRR file to merge by enhancements freeze :)

Thanks!

Priyankasaggu11929 avatar Sep 03 '21 13:09 Priyankasaggu11929

Hello @ahg-g! 1.23 Enhancements team here. Just checking in once again, as we approach enhancements freeze on Thursday 09/09.

We just need the KEP & the PRR file to merge by enhancements freeze, to be tracked under the Kubernetes 1.23 release :)

Thanks!

Priyankasaggu11929 avatar Sep 07 '21 12:09 Priyankasaggu11929

@PriyankaH21 KEP merged including PRR.

ahg-g avatar Sep 09 '21 13:09 ahg-g

Thank you so much, @ahg-g. With the KEP PR merged, this enhancement is ready for the 1.23 enhancements freeze. :)

Priyankasaggu11929 avatar Sep 09 '21 13:09 Priyankasaggu11929

Hi @ahg-g :wave: 1.23 Docs shadow here.

This enhancement is marked as Needs Docs for the 1.23 release.

Please follow the steps detailed in the documentation to open a PR against the dev-1.23 branch in the k/website repo. This PR can be just a placeholder at this time and must be created before Thu November 18, 11:59 PM PDT.

Also, if needed, take a look at "Documenting for a release" to familiarize yourself with the docs requirements for the release.

Thanks!

nate-double-u avatar Sep 17 '21 23:09 nate-double-u

Hello @ahg-g 👋

Checking in once more as we approach 1.23 code freeze at 6:00 pm PST on Tuesday, November 16.

Please ensure the following items are completed:

  • All PRs to the Kubernetes repo that are related to your enhancement are linked in the above issue description (for tracking purposes).
  • All PRs are fully merged by the code freeze deadline.
  • Have a documentation placeholder PR open by Thursday, November 18.

As always, we are here to help should questions come up.

Thank you so much! 🙂

Priyankasaggu11929 avatar Nov 08 '21 14:11 Priyankasaggu11929

I opened a docs PR at https://github.com/kubernetes/website/pull/30390

ahg-g avatar Nov 08 '21 19:11 ahg-g

docs PR merged

ahg-g avatar Nov 10 '21 17:11 ahg-g

Thanks so much for the update, @ahg-g

Priyankasaggu11929 avatar Nov 11 '21 05:11 Priyankasaggu11929

Should we consider schedulerName as well for the next release?

alculquicondor avatar Dec 08 '21 15:12 alculquicondor

Can you state the use case?

ahg-g avatar Dec 08 '21 15:12 ahg-g

The queueing system could decide that the Job should land on specific Nodes, and workloads on those Nodes may need to be scheduled under a specific scheduling profile.

alculquicondor avatar Dec 08 '21 19:12 alculquicondor
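
As a rough illustration of the queueing use case discussed above, a controller could set node affinity on a Job while it is still suspended and unsuspend it afterwards. The sketch below only builds the patch body as a plain Python dict. The helper name `node_affinity_patch` and the label `example.com/pool` are hypothetical, and `schedulerName` is left out because the thread only proposes it for a later release.

```python
import json
from typing import List


def node_affinity_patch(label_key: str, label_values: List[str]) -> dict:
    """Build a patch body that sets required node affinity on a Job's pod template.

    Hypothetical helper: it does not talk to a cluster and does not touch
    spec.suspend; unsuspending the Job would be a separate step.
    """
    return {
        "spec": {
            "template": {
                "spec": {
                    "affinity": {
                        "nodeAffinity": {
                            "requiredDuringSchedulingIgnoredDuringExecution": {
                                "nodeSelectorTerms": [
                                    {
                                        "matchExpressions": [
                                            {
                                                "key": label_key,
                                                "operator": "In",
                                                "values": label_values,
                                            }
                                        ]
                                    }
                                ]
                            }
                        }
                    }
                }
            }
        }
    }


# Example: pin the Job's pods to nodes labeled example.com/pool=batch-a
# (an assumed label, chosen only for illustration).
patch = node_affinity_patch("example.com/pool", ["batch-a"])
print(json.dumps(patch, indent=2))
```

Keeping the affinity change separate from the unsuspend step mirrors the order described in the thread: the directives are decided first, then the Job is released to run.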

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Apr 09 '22 18:04 k8s-triage-robot

/remove-lifecycle stale

ahg-g avatar Apr 09 '22 20:04 ahg-g

/lifecycle stale

k8s-triage-robot avatar Jul 08 '22 20:07 k8s-triage-robot

/remove-lifecycle stale

kerthcet avatar Jul 09 '22 02:07 kerthcet

/lifecycle stale

k8s-triage-robot avatar Oct 07 '22 03:10 k8s-triage-robot

/remove-lifecycle stale

kerthcet avatar Oct 07 '22 05:10 kerthcet

/lifecycle stale

k8s-triage-robot avatar Jan 05 '23 06:01 k8s-triage-robot

/remove-lifecycle stale

ahg-g avatar Jan 05 '23 06:01 ahg-g

@ahg-g: The provided milestone is not valid for this repository. Milestones in this repository: [v1.24, v1.25, v1.26, v1.27]

Use /milestone clear to clear the milestone.

In response to this:

/milestone v.127

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jan 05 '23 06:01 k8s-ci-robot

/milestone v1.27

ahg-g avatar Jan 05 '23 06:01 ahg-g

/label lead-opted-in

ahg-g avatar Jan 05 '23 07:01 ahg-g

For GA, we should reframe the Job docs: the first moment that a Job is not suspended is the point at which certain fields become immutable (or perhaps frozen). That's what end users would experience.

For some Jobs, this is at the point admission completes, because it is admitted unsuspended; for other Jobs, it could plausibly be a day or so later once the Job unsuspends. Does that sound right?

sftim avatar Jan 18 '23 10:01 sftim

I am not sure I follow what changes you would like to see to https://kubernetes.io/docs/concepts/workloads/controllers/job/#mutable-scheduling-directives. The docs state the semantics clearly: "This is allowed only for suspended Jobs that have never been unsuspended before." I guess we can move this sentence to the beginning of the section?

ahg-g avatar Jan 18 '23 21:01 ahg-g
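
The rule quoted above can be modeled in a few lines. This is a minimal sketch, not the API server's validation code: the `JobState` type and the function name are made up for illustration, and treating "never been unsuspended" as "`status.startTime` is unset" is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class JobState:
    """Hypothetical, simplified view of a Job: only the two fields the rule needs."""
    suspend: bool                          # stands in for spec.suspend
    start_time: Optional[datetime] = None  # stands in for status.startTime (assumed proxy)


def scheduling_directives_mutable(job: JobState) -> bool:
    """Return True only for a suspended Job that has never started."""
    return job.suspend and job.start_time is None


# Created suspended and never released: directives can still change.
created_suspended = JobState(suspend=True)

# Admitted unsuspended: the fields are frozen as soon as admission completes.
admitted_running = JobState(suspend=False, start_time=datetime(2023, 1, 18))

# Ran once, then suspended again: still frozen, because it has started before.
resuspended = JobState(suspend=True, start_time=datetime(2023, 1, 18))

for job in (created_suspended, admitted_running, resuspended):
    print(job, scheduling_directives_mutable(job))
```

The third case is the one the docs discussion turns on: suspending a Job again does not reopen the window for changing its scheduling directives.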

The changes I'm thinking of are to the structure of the docs. For example, the heading might change. Once a feature is GA, or sometimes before, we shift to describing Kubernetes almost as if that feature had always been part of the API.

SIG Docs can help review draft PRs, etc, here.

sftim avatar Jan 19 '23 08:01 sftim

/stage stable
/milestone v1.27
/label lead-opted-in

ahg-g avatar Jan 23 '23 15:01 ahg-g