feat(scheduledsparkapplication): add configurable timestampPrecision for run name generation
This PR introduces a new field, `.spec.timestampPrecision`, that allows users to control the precision of the timestamp suffix appended to SparkApplication names generated by a ScheduledSparkApplication. Supported values are:
`nanos` | `micros` | `millis` | `seconds` | `minutes`
The default remains nanos for full backward compatibility.
Summary
Previously, scheduled runs always used `time.Now().UnixNano()` to generate the timestamp suffix, producing a 19-digit value. When combined with long application names, this frequently caused the run name to exceed Kubernetes' 63-character limit.
This PR makes the timestamp precision configurable so users can choose a shorter suffix if needed. A new `minutes` option is also added to match the Kubernetes CronJob controller, which schedules only at minute granularity.
Key Changes
API / CRD
- Added new optional field `.spec.timestampPrecision` to `ScheduledSparkApplicationSpec`
- Enum validation: `nanos`, `micros`, `millis`, `seconds`, `minutes`
- Default: `"nanos"` (preserves current behavior)
- Regenerated CRDs using `make generate` + `make manifests`
Controller
- Added a `formatTimestamp()` helper to format timestamps according to the selected precision
- Updated run-name generation to use this helper
- `minutes` mode computes `Unix()/60` to stay consistent with CronJob naming semantics
Tests
- Added `format_timestamp_test.go` to validate the timestamp length for all supported precisions
- Updated the `envtest` setup helper script for contributors
Helm Chart
- Added optional value `controller.scheduledSparkApplication.timestampPrecision`
- Defaults to `"nanos"`
Why This Is Needed (Fixes #2602)
Users with long application names often hit Kubernetes’ 63-character name limit because the operator always appends a 19-digit nanosecond timestamp.
Allowing precision selection reduces the suffix size:
| Precision | Digits | Example Use Case |
|---|---|---|
| minutes | ~8 | Matches CronJob granularity; shortest valid suffix |
| seconds | 10 | Common for hourly/daily jobs |
| millis | 13 | High-rate jobs |
| micros | 16 | Advanced workloads |
| nanos | 19 | Current behavior |
This gives users flexibility while keeping backward compatibility.
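The interaction with the 63-character limit can be sketched as follows. The `<name>-<timestamp>` run-name format and the example application name are assumptions chosen for illustration, not the operator's exact naming code:

```go
package main

import (
	"fmt"
	"time"
)

// maxNameLength is the Kubernetes object-name limit (DNS-1123 label).
const maxNameLength = 63

// runName joins a base name and a timestamp suffix with a hyphen and
// reports whether the result fits within the Kubernetes limit.
func runName(base, suffix string) (string, bool) {
	name := fmt.Sprintf("%s-%s", base, suffix)
	return name, len(name) <= maxNameLength
}

func main() {
	// A 50-character application name, typical of the long names in #2602.
	base := "my-team-nightly-feature-aggregation-pipeline-spark"

	nanos := fmt.Sprintf("%d", time.Now().UnixNano())  // 19 digits
	minutes := fmt.Sprintf("%d", time.Now().Unix()/60) // ~8 digits

	if _, ok := runName(base, nanos); !ok {
		fmt.Println("nanos suffix exceeds the 63-character limit")
	}
	if name, ok := runName(base, minutes); ok {
		fmt.Println("minutes suffix fits:", name)
	}
}
```

With a 50-character base name, the nanosecond suffix pushes the run name to 70 characters and the creation would fail, while the minute-precision suffix keeps it at 59.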
How I Tested

```shell
gofmt -s -w .
make generate
make manifests

# Set up envtest
bash scripts/setup-envtest-binaries.sh
export KUBEBUILDER_ASSETS="$(pwd)/bin/k8s/v1.32.0-linux-amd64"

# Run unit tests
go test ./internal/controller/scheduledsparkapplication -v
```
- Verified CRD generated correctly (enum + default)
- Ensured all timestamp precisions produce expected digit lengths
- Confirmed controller creates SparkApplication names with correct suffixes
Checklist:
- [x] Self-review completed
- [x] CRD + controller changes implemented
- [x] Unit tests added and passing
- [x] Helm values updated
- [x] Backward compatibility preserved
Hi @rahul810050, you do not have to create a new PR for every update. You can run the `git push` command with an extra `--force` flag.
Sure @ChenYi015, I will keep it in mind... thanks!
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from chenyi015. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment
Some changes still need to be made:
- We do not have to remove `controller.scheduledSparkApplication` from `values.yaml`.
- Add a new flag parameter named `ScheduledSATimestampPrecision`, like: https://github.com/kubeflow/spark-operator/blob/f53373e7e94ec6eb69d2f6829e61e9c9497e71d6/cmd/operator/controller/start.go#L86-L96
- Then pass the value to `scheduledsparkapplication.Options`, like: https://github.com/kubeflow/spark-operator/blob/f53373e7e94ec6eb69d2f6829e61e9c9497e71d6/cmd/operator/controller/start.go#L443-L448
Some changes still need to be made:
- Changes to files `api/v1beta2/scheduledsparkapplication_types.go` and `config/crd/bases/sparkoperator.k8s.io_scheduledsparkapplications.yaml` should be discarded, since we decided not to modify the `ScheduledSparkApplication` CRD.
- File `scripts/setup-envtest-binaries.sh` should be removed; I have created PR #2751 to fix `make unit-test` not working on Arch Linux.
- File `internal/controller/scheduledsparkapplication/format_timestamp_test.go` should be removed because the added unit tests have been moved to `internal/controller/scheduledsparkapplication/controller_test.go`.
- Comments for `controller.scheduledSparkApplication.timestampPrecision` in `values.yaml` need to be updated as suggested by the previous review comments.
- Run `make generate`, `make manifests` and `make helm-docs`, and commit the changes.
Hi @ChenYi015! Thanks for the clarification! I will update the PR soon.
@ChenYi015, could you please review it?