
Proposal: Add TTL fields to Build and BuildConfig (automatic cleanup similar to Kubernetes Jobs)

Open jangel97 opened this issue 4 months ago • 0 comments

Introduce optional TTL (time-to-live) fields in the Build and BuildConfig APIs to automatically clean up finished builds after a configurable period, similar to ttlSecondsAfterFinished in Kubernetes Jobs.

Motivation

Today, OpenShift administrators and CI/CD users often implement custom automation (e.g., CronJobs or external scripts) to prune old builds. OpenShift does provide successfulBuildsHistoryLimit and failedBuildsHistoryLimit, but both are count-based rather than time-based.

This proposal would extend both Build and BuildConfig objects with time-based retention semantics, so that a finished build is removed a fixed interval after it completes without any external tooling.

Proposed API changes

In build/v1/BuildSpec

// SuccessfulBuildTTLSeconds defines how long (in seconds) a successful build
// is retained after completion before being automatically deleted.
SuccessfulBuildTTLSeconds *int32 `json:"successfulBuildTTLSeconds,omitempty"`

// FailedBuildTTLSeconds defines how long (in seconds) a failed or errored build
// is retained after completion before being automatically deleted.
FailedBuildTTLSeconds *int32 `json:"failedBuildTTLSeconds,omitempty"`

In build/v1/BuildConfigSpec

// DefaultSuccessfulBuildTTLSeconds sets the default retention time (in seconds)
// for successful builds created from this BuildConfig.
DefaultSuccessfulBuildTTLSeconds *int32 `json:"defaultSuccessfulBuildTTLSeconds,omitempty"`

// DefaultFailedBuildTTLSeconds sets the default retention time (in seconds)
// for failed or errored builds created from this BuildConfig.
DefaultFailedBuildTTLSeconds *int32 `json:"defaultFailedBuildTTLSeconds,omitempty"`

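These fields would mirror the semantics of Kubernetes’ ttlSecondsAfterFinished for Jobs, with one difference: a Job has a single TTL that covers both succeeded and failed Jobs, whereas this proposal keeps separate TTLs for successful and failed builds.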
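The Default* fields are described as defaults for builds created from a BuildConfig. A minimal sketch of how that copy could work when a build is instantiated is shown here. The type and function names (buildConfigSpec, buildSpec, applyTTLDefaults, copyTTL, int32Ptr) are hypothetical stand-ins, not the real build/v1 types, and the precedence rule (a value already set on the Build wins over the BuildConfig default) is an assumption that the proposal does not specify.

```go
package main

import "fmt"

// buildConfigSpec and buildSpec are hypothetical, trimmed-down stand-ins for
// the build/v1 types; only the proposed TTL fields are modelled.
type buildConfigSpec struct {
	DefaultSuccessfulBuildTTLSeconds *int32
	DefaultFailedBuildTTLSeconds     *int32
}

type buildSpec struct {
	SuccessfulBuildTTLSeconds *int32
	FailedBuildTTLSeconds     *int32
}

// copyTTL returns a fresh pointer so the Build never aliases the
// BuildConfig's value.
func copyTTL(v *int32) *int32 {
	if v == nil {
		return nil
	}
	c := *v
	return &c
}

// applyTTLDefaults fills unset TTL fields on a new Build from its BuildConfig.
// Assumption: a value already set on the Build takes precedence.
func applyTTLDefaults(bc buildConfigSpec, b *buildSpec) {
	if b.SuccessfulBuildTTLSeconds == nil {
		b.SuccessfulBuildTTLSeconds = copyTTL(bc.DefaultSuccessfulBuildTTLSeconds)
	}
	if b.FailedBuildTTLSeconds == nil {
		b.FailedBuildTTLSeconds = copyTTL(bc.DefaultFailedBuildTTLSeconds)
	}
}

func int32Ptr(v int32) *int32 { return &v }

func main() {
	bc := buildConfigSpec{
		DefaultSuccessfulBuildTTLSeconds: int32Ptr(86400),  // 1 day
		DefaultFailedBuildTTLSeconds:     int32Ptr(604800), // 7 days
	}
	b := buildSpec{FailedBuildTTLSeconds: int32Ptr(3600)} // explicit override
	applyTTLDefaults(bc, &b)
	fmt.Println(*b.SuccessfulBuildTTLSeconds, *b.FailedBuildTTLSeconds) // 86400 3600
}
```

Copying the value rather than the pointer means a later edit to the BuildConfig would not silently change the TTL of builds that already exist, which matches the statement that defaults apply to new Builds.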

Behavioral overview

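  • A Build with SuccessfulBuildTTLSeconds or FailedBuildTTLSeconds set will be automatically deleted by the Build controller once the configured number of seconds has elapsed since its completionTimestamp. The successful TTL applies to builds that complete successfully; the failed TTL applies to builds that fail or error.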
  • When unset, no automatic deletion occurs (backwards-compatible).
  • When defined in a BuildConfig, these defaults propagate to new Builds created from it.
  • The logic parallels the Kubernetes TTL-after-finished controller for Jobs, keeping cleanup behavior consistent across cluster resources.
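The behavior above can be sketched as a small expiry check. This is an illustration under stated assumptions, not the Build controller's implementation: the build struct is a hypothetical, trimmed-down stand-in for the real API type, the helper names (ttlFor, timeUntilExpiry, exampleBuild) are invented for this example, and the mapping of phases to TTL fields follows the field comments in the proposal (Complete uses the successful TTL; Failed and Error use the failed TTL).

```go
package main

import (
	"fmt"
	"time"
)

// BuildPhase uses the phase names reported in a Build's status.
type BuildPhase string

const (
	PhaseRunning  BuildPhase = "Running"
	PhaseComplete BuildPhase = "Complete"
	PhaseFailed   BuildPhase = "Failed"
	PhaseError    BuildPhase = "Error"
)

// build is a hypothetical, flattened stand-in for a Build object that carries
// only the fields this sketch needs.
type build struct {
	Phase                     BuildPhase
	CompletionTimestamp       *time.Time
	SuccessfulBuildTTLSeconds *int32
	FailedBuildTTLSeconds     *int32
}

// ttlFor returns the TTL that applies to the build's phase, or nil when the
// build is not finished or the relevant field is unset.
func ttlFor(b build) *int32 {
	switch b.Phase {
	case PhaseComplete:
		return b.SuccessfulBuildTTLSeconds
	case PhaseFailed, PhaseError:
		return b.FailedBuildTTLSeconds
	default:
		return nil
	}
}

// timeUntilExpiry reports how long remains before the build becomes eligible
// for deletion. ok is false when no TTL applies. A zero or negative duration
// means the TTL has already expired.
func timeUntilExpiry(b build, now time.Time) (remaining time.Duration, ok bool) {
	ttl := ttlFor(b)
	if ttl == nil || b.CompletionTimestamp == nil {
		return 0, false
	}
	expireAt := b.CompletionTimestamp.Add(time.Duration(*ttl) * time.Second)
	return expireAt.Sub(now), true
}

// exampleBuild returns a successful build with a one-hour TTL.
func exampleBuild() build {
	finished := time.Date(2025, 10, 24, 14, 0, 0, 0, time.UTC)
	ttl := int32(3600)
	return build{
		Phase:                     PhaseComplete,
		CompletionTimestamp:       &finished,
		SuccessfulBuildTTLSeconds: &ttl,
	}
}

func main() {
	b := exampleBuild()
	now := b.CompletionTimestamp.Add(30 * time.Minute)
	remaining, ok := timeUntilExpiry(b, now)
	fmt.Println(remaining, ok) // 30m0s true
}
```

A controller built on this check could delete the build when the remaining duration is zero or negative and otherwise requeue it after the remaining duration, rather than polling every build on a fixed interval.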

Benefits

  • Reduces cluster bloat from thousands of completed builds.
  • Eliminates the need for external cron-based cleanup.
  • Aligns OpenShift’s Build subsystem with core Kubernetes resource lifecycle concepts.
  • Fully backward compatible: the fields are optional, and leaving them unset preserves current behavior.

Open questions

  • Should this feature be gated behind a FeatureGate (e.g., BuildTTLSeconds) during TechPreview?
  • Should the controller emit metrics/events when builds are TTL-deleted?
  • How does this interact with existing build pruning logic?

References

  • [Kubernetes JobSpec: ttlSecondsAfterFinished](https://kubernetes.io/docs/concepts/workloads/controllers/job/#clean-up-finished-jobs-automatically)
  • successfulBuildsHistoryLimit and failedBuildsHistoryLimit in BuildConfigSpec

Next steps

If this proposal aligns with the API team’s direction, I’d be happy to prepare an OpenShift Enhancement Proposal (OEP) in openshift/enhancements to describe the full design and rollout plan.

jangel97, Oct 24 '25 14:10