
Cleanup staging buckets in boskos

Open · ameukam opened this issue 2 years ago · 10 comments

Each GCP project in the boskos pool has a staging bucket used to upload binaries built during test execution.

Those buckets were never cleaned up, and with the GCS pricing change they now have a noticeable impact on our budget.

We should clean up those buckets, make them regional, and introduce a retention policy (7 days?).
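
For the cleanup piece, one way to sketch it is a GCS object lifecycle rule that deletes objects older than 7 days (a lifecycle rule, as opposed to a bucket retention policy, which locks objects against deletion; the bucket name below is one of the staging buckets listed further down, used here purely as an example):

# Write a lifecycle config that deletes objects once they are 7 days old.
cat > /tmp/lifecycle.json <<'EOF'
{
  "rule": [
    {"action": {"type": "Delete"}, "condition": {"age": 7}}
  ]
}
EOF
# Apply it to a staging bucket (example bucket name).
gsutil lifecycle set /tmp/lifecycle.json gs://kubernetes-staging-485128143e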

/sig testing
/milestone v1.27

ameukam avatar Jan 27 '23 22:01 ameukam

cc @BenTheElder @thockin

ameukam avatar Jan 27 '23 22:01 ameukam

For a boskos project:

gcloud alpha storage buckets list --project k8s-infra-e2e-boskos-001 --format='table(name,locationType,location)'
NAME                                        LOCATION_TYPE  LOCATION
kubernetes-staging-485128143e               multi-region   US
kubernetes-staging-485128143e-asia          multi-region   ASIA
kubernetes-staging-485128143e-eu            multi-region   EU
kubernetes-staging-485128143e-europe-west6  region         EUROPE-WEST6
gsutil du -sh gs://kubernetes-staging-485128143e
288.45 GiB   gs://kubernetes-staging-485128143e
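
Not from the thread as written, but the two commands above combine into a quick audit loop to gauge the pool-wide footprint (a sketch; assumes gcloud and gsutil are authenticated, and reuses the boskos project filter from the cleanup script below):

for prj in $(gcloud projects list --filter='projectId~^k8s-infra-e2e-boskos-' --format="value(projectId)"); do
    for bucket in $(gcloud storage buckets list --project "${prj}" --format="value(name)"); do
        # Report total usage per staging bucket.
        gsutil du -sh "gs://${bucket}"
    done
done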

ameukam avatar Jan 27 '23 22:01 ameukam

I think we should set a small TTL of no more than 1 day and rotate these to regional.

We could safely delete-and-recreate these on every run in boskos-leased projects, but in other, fixed projects where CI jobs still share a bucket (5k-node scale testing, maybe?) we can't do that as easily.

There might be an argument for simply updating the scripts that ensure these buckets exist so that they create them with the new settings, and then doing a mass deletion of the existing buckets one evening ...
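
A minimal sketch of what that create path could look like under this proposal: a regional bucket plus a 1-day object TTL. BUCKET, PROJECT, and the region are illustrative placeholders, not settings from the thread:

# Create the staging bucket as regional rather than multi-region.
gcloud storage buckets create "gs://${BUCKET}" --project "${PROJECT}" --location=us-central1

# Attach a 1-day delete rule so leftover artifacts age out on their own.
cat > /tmp/ttl.json <<'EOF'
{"rule": [{"action": {"type": "Delete"}, "condition": {"age": 1}}]}
EOF
gsutil lifecycle set /tmp/ttl.json "gs://${BUCKET}"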

BenTheElder avatar Jan 30 '23 20:01 BenTheElder

#115634 is breaking CI by preventing object creation during job execution (with a retention policy in place, GCS refuses to delete or overwrite an object until it reaches the retention age, so jobs that write to existing paths fail).

https://github.com/kubernetes/kubernetes/pull/116222 should help fix this.

At the same time, I ran a quick script to clear the retention policy on all the buckets. This should be enough to fix the issue.

projects=$(gcloud projects list --filter='projectId~^k8s-infra-e2e-boskos-' --format="value(projectId)" --sort-by=projectId)

for prj in ${projects}; do
    bucket=$(gcloud storage buckets list --project "${prj}" --filter='location=us-central1' --format="value(name)")
    if [[ -n "${bucket}" ]]; then
        # Only clear the policy on buckets that exist and are reachable;
        # "gsutil ls -b" stats the bucket itself instead of listing its objects.
        if gsutil ls -b "gs://${bucket}" >/dev/null 2>&1; then
            echo "clearing retention policy for ${bucket} in ${prj}"
            gsutil retention clear "gs://${bucket}"
        fi
    fi
done
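
To spot-check that a bucket's policy is really gone afterwards (a quick verification step, not part of the original script; the bucket name is the example from above):

gsutil retention get gs://kubernetes-staging-485128143e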

ameukam avatar Mar 02 '23 18:03 ameukam

thanks @ameukam

dims avatar Mar 02 '23 18:03 dims

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar May 31 '23 19:05 k8s-triage-robot

/remove-lifecycle stale

Will take a look in 1.29 to remove the multi-regional buckets.

/milestone v1.29

ameukam avatar Jun 01 '23 07:06 ameukam

/kind /priority backlog /area infra/gcp /lifecycle frozen

ameukam avatar Feb 02 '24 14:02 ameukam

@ameukam: The label(s) kind/backlog cannot be applied, because the repository doesn't have them.

In response to this:

/kind /priority backlog /area infra/gcp /lifecycle frozen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Feb 02 '24 14:02 k8s-ci-robot

/milestone v1.32

ameukam avatar Mar 03 '24 14:03 ameukam