Prow control plane migration (k8s-prow > k8s-infra-prow)
Using this to track what's done/to-do, and communicate higher-traffic updates. Broader updates should go to https://groups.google.com/a/kubernetes.io/g/dev/c/qzNYpcN5la4.
Based on the proposal doc at https://docs.google.com/document/d/1erBhuCwY26d0UfPbzt8lEj6bYT2hOUKzc2j36YHVqfM.
Pre-migration:
- [x] ~~List and drop prowjobs that will not be migrated (k/t-i#33272)~~
- [x] ~~Add banner warning people about migration date (Wed, August 21) (Prow + TestGrid, done)~~
- [x] Add tracking issue, and communicate migration progress as it happens
- [x] Track down additional infra in SIG K8s Infra that may be using workload identity (I believe Ben did this during migration)
- [x] ~~Spin down Boskos use (k/t-i#33129)~~
- [x] ~~Ban use of default/unspecified cluster (k/t-i#33272)~~
- [x] Prepare a quick "scale all the old controllers to 0 and scale the new ones up" PR
  - PR should update the deployment replicas; run the make target to manually deploy
    - cd test-infra/config/prow
    - make deploy-prow?
  - (Rollback is just a revert to a previous Git commit + make target)
- [x] Test new Prow with fake configmap (e.g. has a single job w/ "hello world")
- [x] ~~Ensure that some key people have access to both the Google and community projects~~
  - [x] ~~Ben, Cole, and Michelle should or already have access on both (k8s-prow and k8s-infra-prow)~~
- [x] Prepare for switching Deck over
  - ~~(not doing) Set up a new domain (e.g. k8s-infra-prow.k8s.io) pointing at the new Deck deployment~~
  - [x] ~~Create Prow certificates (https://github.com/kubernetes/k8s.io/pull/7194)~~
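
The "ban use of default/unspecified cluster" item above could be checked mechanically along the following lines. This is only a sketch, not the check that was actually used: it assumes the job config has already been parsed into a plain dict with the usual `presubmits` / `postsubmits` / `periodics` layout, and the function name, job names, and sample data are all hypothetical.

```python
from typing import Any

# A job with no cluster, an empty cluster, or the "default" alias is treated
# as unmigrated in this sketch.
DEFAULT_ALIASES = {"", "default"}


def jobs_without_explicit_cluster(job_config: dict[str, Any]) -> list[str]:
    """Return sorted identifiers of jobs that do not name a non-default cluster."""
    offenders = []
    # Presubmits and postsubmits are maps of "org/repo" -> list of jobs.
    for kind in ("presubmits", "postsubmits"):
        for repo, jobs in (job_config.get(kind) or {}).items():
            for job in jobs or []:
                if (job.get("cluster") or "") in DEFAULT_ALIASES:
                    offenders.append(f"{kind}/{repo}/{job.get('name', '<unnamed>')}")
    # Periodics are a flat list of jobs.
    for job in job_config.get("periodics") or []:
        if (job.get("cluster") or "") in DEFAULT_ALIASES:
            offenders.append(f"periodics/{job.get('name', '<unnamed>')}")
    return sorted(offenders)


# Hypothetical sample data for illustration only.
sample = {
    "presubmits": {
        "kubernetes/test-infra": [
            {"name": "pull-example-ok", "cluster": "k8s-infra-prow-build"},
            {"name": "pull-example-no-cluster"},
        ]
    },
    "periodics": [
        {"name": "ci-example-default", "cluster": "default"},
    ],
}

for offender in jobs_without_explicit_cluster(sample):
    print(offender)
```

A non-empty result would mean some job still relies on the default cluster and needs to be migrated or dropped before the cutover.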
Right before migration:
- [x] Drop remaining unmigrated Prow jobs https://github.com/kubernetes/test-infra/pull/33352 / https://github.com/kubernetes/test-infra/issues/33226
- [x] (ignore, no longer needed) Block changes to Prow and Prowjob config in test-infra (except people working on migration)
- [x] Sync logs from buckets to new buckets
  - [x] ~~gs://kubernetes-jenkins (there's a transfer job running for this bucket to kubernetes-ci-logs)~~
- [x] Scale all the new controllers to 0.
- [x] Sync current configmap with new Prow
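
For reference, scaling controllers to 0 (or back up) can also be done imperatively rather than through the replicas PR and make target described above. The sketch below only builds and prints `kubectl scale` commands; it does not run them. The context name `new-prow`, the `default` namespace, and the list of Deployments are assumptions for illustration, not values taken from the actual clusters.

```python
import shlex

# Illustrative subset of Prow controller Deployments (assumption); the real
# list is whatever the Prow manifests deploy.
CONTROLLERS = ["hook", "tide", "sinker", "horologium", "prow-controller-manager", "crier"]


def scale_commands(context, replicas, namespace="default", deployments=CONTROLLERS):
    """Build one `kubectl scale` argv per Deployment, without executing anything."""
    return [
        ["kubectl", "--context", context, "--namespace", namespace,
         "scale", f"deployment/{name}", f"--replicas={replicas}"]
        for name in deployments
    ]


def render(commands):
    """Render argv lists as shell-quoted lines for review before running."""
    return "\n".join(shlex.join(cmd) for cmd in commands)


# "new-prow" is a hypothetical kubeconfig context name.
print(render(scale_commands("new-prow", 0)))
```

Printing the commands first keeps the step reviewable; the declarative route (revert the PR, rerun the make target) remains the rollback path described in the checklist.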
During migration:
Begins ~10:30am PT, Wednesday August 21
- [x] Scale down the old Prow
- [x] Copy prowjobs from old to new Prow
- [x] ~~Trigger a final run of https://console.cloud.google.com/transfer/jobs/transferJobs%2Fkubernetes-jenkins-transfer/runs?project=k8s-infra-prow and delete it.~~
- [x] Scale up the new Prow
- [x] Switch webhooks
- [x] Verify new Deck is working (watch jobs start and finish successfully)
- [x] Update DNS entries https://github.com/kubernetes/k8s.io/pull/7206
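
One detail of the "copy prowjobs from old to new Prow" step: objects exported from one cluster carry server-populated metadata that generally has to be dropped before they can be created in another. The sketch below shows one way to do that on an exported list (for example, the JSON from `kubectl get prowjobs -o json`). It is an assumption about how the copy could be done, not a record of how it was done; the function names and the sample object are hypothetical.

```python
import copy

# Metadata fields populated by the API server of the source cluster.
SERVER_SET_METADATA = (
    "resourceVersion", "uid", "creationTimestamp",
    "generation", "managedFields", "selfLink",
)


def portable_copy(obj):
    """Return a deep copy of a Kubernetes object without server-set metadata."""
    clean = copy.deepcopy(obj)
    metadata = clean.get("metadata", {})
    for field in SERVER_SET_METADATA:
        metadata.pop(field, None)
    return clean


def portable_list(exported):
    """Apply portable_copy to every item of an exported `kind: List` document."""
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [portable_copy(item) for item in exported.get("items", [])],
    }


# Hypothetical exported ProwJob, trimmed for illustration.
exported = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [{
        "apiVersion": "prow.k8s.io/v1",
        "kind": "ProwJob",
        "metadata": {
            "name": "example-job-id",
            "namespace": "default",
            "resourceVersion": "12345",
            "uid": "00000000-0000-0000-0000-000000000000",
        },
        "spec": {"job": "ci-example", "type": "periodic"},
    }],
}

cleaned = portable_list(exported)
print(cleaned["items"][0]["metadata"])
```

The sketch leaves `spec` and `status` untouched; whether status survives re-creation on the new cluster depends on how the ProwJob CRD is installed there.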
Post-migration:
- [ ] Debug and fix the external secrets instance. It seems to be getting stuck and is not syncing secrets to the cluster.
- [x] https://github.com/kubernetes/k8s.io/pull/7141
- [x] https://github.com/kubernetes/k8s.io/pull/7211/
- [x] https://github.com/kubernetes/test-infra/pull/33359
- [x] https://github.com/kubernetes/k8s.io/pull/7207
- [x] Turn down old monitoring stack
- [x] Transfer Kettle and TestGrid to use new logs buckets
  - Not part of control plane migration, instead see #33381
- [x] https://github.com/kubernetes/k8s.io/pull/7231
- [ ] Turn down old Deck and old CRs
  - [x] Delete remaining jobs on old Prow
  - [ ] Ensure no repositories are registered with k8s-prow
  - [ ] Delete Prow components in k8s-prow
- [ ] Handle logs buckets
  - [ ] Add TTL and remove permissions for gs://kubernetes-jenkins
  - [x] ~~Remove gs://kubernetes-jenkins-pull~~
  - [ ] Set a lifecycle policy for the new Prow bucket gs://kubernetes-ci-logs
- [ ] Remove deprecated code/references (for handling k8s-prow control plane)
- [ ] Delete the public IP address used to serve Prow
- [x] Reconcile and merge https://github.com/kubernetes/k8s.io/pull/7205
Other resources may need to remain in the k8s-prow project (esp. images)
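
For the lifecycle policy item in the post-migration list, a Cloud Storage lifecycle configuration is a small JSON document. The sketch below generates a delete-after-N-days rule; the 90-day value and the function name are placeholders, since the actual retention period for gs://kubernetes-ci-logs is not decided in this issue.

```python
import json


def delete_after_days(days):
    """Build a GCS lifecycle config that deletes objects older than `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    return {
        "rule": [
            {
                "action": {"type": "Delete"},
                "condition": {"age": days},
            }
        ]
    }


# 90 is a placeholder retention period, not a decision recorded in this issue.
policy_json = json.dumps(delete_after_days(90), indent=2)
print(policy_json)
```

The resulting file could then be applied with something like `gsutil lifecycle set lifecycle.json gs://kubernetes-ci-logs`, once the retention period is agreed.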