ScheduledSparkApplication controller ignores schedule spec changes, requiring a manual edit of status.nextRun via the status subresource
What happened?
Description: ScheduledSparkApplication controller ignores schedule spec changes and requires manual status subresource editing to apply new schedules.
Workaround: Edit the status subresource and manually set nextRun to the desired start time. I suppose you could also set scheduledState to "New".
Controller Code Issue: The controller in pkg/controller/scheduledsparkapplication/controller.go lacks spec change detection logic and only recalculates nextRun when status.nextRun is zero. I am not sure whether this is intentional for some reason I am not seeing.
The observedGeneration pattern described here might help: https://alenkacz.medium.com/kubernetes-operator-best-practices-implementing-observedgeneration-250728868792
Then, in the switch case for when the state is already Scheduled, the controller could check whether the generations match to decide if NextRunTime needs to be recalculated: https://github.com/kubeflow/spark-operator/blob/master/internal/controller/scheduledsparkapplication/controller.go#L160
If you agree, I am happy to raise a PR
Reproduction Code
- Create a ScheduledSparkApplication with schedule "10 * * * *"
- Let it reach ScheduleStateScheduled
- Update spec.schedule to "15 * * * *", e.g. kubectl edit scheduledsparkapp <app-name> and edit the schedule
- Observe that nextRun time is not recalculated
Expected behavior
- Controller should detect spec changes and recalculate status.nextRun automatically
- Controller should implement the generation/observedGeneration pattern, as other Kubernetes controllers do
Actual behavior
- Editing the main ScheduledSparkApplication spec.schedule field does not trigger recalculation of status.nextRun
- The controller continues using the old schedule timing stored in status.nextRun
- The only workaround is to manually edit the status subresource
I am seeing this as well with v2.2.0 after upgrading from the v1 series of the operator. Similarly, if the ScheduledSparkApplication transitions into a "FailedValidation" state, it gets stuck and can't be edited in place to fix the validation issue without clearing the status.
The workaround from @dk805 does work in the interim, but it would be great to have this working again.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
/reopen
This is still an issue, and a really annoying one: you need to delete the resource for the new schedule times to take effect.
surely we can fix this?