Signing flakes on krel release
What happened:
On a couple of the patch releases, we hit flakes with signing and needed to re-run the no mock release stages:
level=fatal msg="signing the blobs: signing the file /tmp.k8s/release-sign-blobs-1616671436/kubernetes-release/stage/v1.24.9-rc.0.42+15da17757dcdd3/v1.24.9/gcs-stage/v1.24.9/bin/windows/amd64/kubectl.exe: verifying signed file: /tmp.k8s/release-sign-blobs-1616671436/kubernetes-release/stage/v1.24.9-rc.0.42+15da17757dcdd3/v1.24.9/gcs-stage/v1.24.9/bin/windows/amd64/kubectl.exe: open /tmp.k8s/release-sign-blobs-1616671436/kubernetes-release/stage/v1.24.9-rc.0.42+15da17757dcdd3/v1.24.9/gcs-stage/v1.24.9/bin/windows/amd64/kubectl.exe.cert: no such file or directory"
level=fatal msg="signing the blobs: signing the file /tmp.k8s/release-sign-blobs-691105010/kubernetes-release-gcb/stage/v1.23.15-rc.0.32+06089cc90f824e/v1.23.15/gcs-stage/v1.23.15/bin/windows/arm64/kubectl-convert.exe: verifying signed file: /tmp.k8s/release-sign-blobs-691105010/kubernetes-release-gcb/stage/v1.23.15-rc.0.32+06089cc90f824e/v1.23.15/gcs-stage/v1.23.15/bin/windows/arm64/kubectl-convert.exe: open /tmp.k8s/release-sign-blobs-691105010/kubernetes-release-gcb/stage/v1.23.15-rc.0.32+06089cc90f824e/v1.23.15/gcs-stage/v1.23.15/bin/windows/arm64/kubectl-convert.exe.cert: no such file or directory"
level=fatal msg="signing the blobs: signing the file /tmp.k8s/release-sign-blobs-3048792261/kubernetes-release/stage/v1.22.17-rc.0.16+611514908b25d5/v1.22.17/gcs-stage/v1.22.17/bin/windows/amd64/kubelet.exe: verifying signed file: /tmp.k8s/release-sign-blobs-3048792261/kubernetes-release/stage/v1.22.17-rc.0.16+611514908b25d5/v1.22.17/gcs-stage/v1.22.17/bin/windows/amd64/kubelet.exe: open /tmp.k8s/release-sign-blobs-3048792261/kubernetes-release/stage/v1.22.17-rc.0.16+611514908b25d5/v1.22.17/gcs-stage/v1.22.17/bin/windows/amd64/kubelet.exe.cert: no such file or directory"
What you expected to happen:
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
- Cloud provider or hardware configuration:
- OS (e.g. cat /etc/os-release):
- Kernel (e.g. uname -a):
- Others:
@cpanato this seems to be another race we hit when signing release artifacts. Do you want to give this a look? (maybe @puerco already did)
/remove-label priority/important-soon
/priority critical-urgent
@xmudrii: The label(s) /remove-label priority/important-soon cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda, refactor. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?
In response to this:
/remove-label priority/important-soon
/priority critical-urgent
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@kubernetes/release-managers Carlos is out for a couple of days, do we have any volunteer to support here?
First investigation: the certificate (.cert) file is supposed to be written by cosign after the signature has been written:
https://github.com/sigstore/cosign/blob/d1c6336475b4be26bb7fb52d97f56ea0a1767f9f/cmd/cosign/cli/sign/sign_blob.go#L120-L129
It looks like we never reach the point where the certificate file gets written, so I'm assuming that len(rekorBytes) == 0 :thinking:
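For reference, here is a simplified paraphrase of the logic around the linked lines (not verbatim cosign code, just a sketch from reading it): the signature file is always written, but the certificate file is only written when the cert bytes coming back from the keyless signer are non-empty, so an empty rekorBytes would silently skip the .cert sidecar that the verify step later looks for:

```go
// Simplified paraphrase of the linked sign_blob.go lines (not verbatim cosign
// code): the signature is always written, but the certificate sidecar is only
// written when the cert bytes from the keyless signer are non-empty.
package sketch

import (
	"fmt"
	"os"
)

func writeOutputs(sig, rekorBytes []byte, outputSignature, outputCertificate string) error {
	if outputSignature != "" {
		if err := os.WriteFile(outputSignature, sig, 0600); err != nil {
			return fmt.Errorf("create signature file: %w", err)
		}
	}
	// If rekorBytes is empty, this branch is skipped and no .cert file is
	// produced, which matches the "no such file or directory" errors the
	// verify step reports above.
	if outputCertificate != "" && len(rekorBytes) > 0 {
		if err := os.WriteFile(outputCertificate, rekorBytes, 0600); err != nil {
			return fmt.Errorf("create certificate file: %w", err)
		}
	}
	return nil
}
```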
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/assign
@jeremyrickard would it be possible to see more of the logs? I can't access them with the links above.
I am trying to build some context for myself to understand where in the process this happens. Is this the part triggered by krel release? If so, where during that part are we doing a blob sign?
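For anyone else digging in, a hypothetical illustration (not krel's actual code) of where the flake surfaces based on the logs above: after signing a staged binary, verification expects sidecar files next to it, and the fatal errors come from opening a .cert that was never written:

```go
// Hypothetical sketch (not krel code): the verify step appears to expect
// sidecar files next to each staged binary. The .cert is what the logs show
// missing; the .sig sidecar is an assumption based on how cosign sign-blob
// normally writes its outputs.
package sketch

import (
	"fmt"
	"os"
)

// checkSidecars reports whether the signed-blob sidecar files exist for a binary.
func checkSidecars(binPath string) error {
	for _, ext := range []string{".sig", ".cert"} {
		if _, err := os.Stat(binPath + ext); err != nil {
			return fmt.Errorf("missing sidecar for %s: %w", binPath, err)
		}
	}
	return nil
}
```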
/lifecycle stale
/remove-lifecycle stale
/lifecycle stale
/lifecycle frozen