Signing flakes on krel release

Open jeremyrickard opened this issue 3 years ago • 13 comments

What happened:

On a couple of the patch releases, we hit flakes with signing and needed to re-run the no mock release stages:

Signing Flake for 1.24.9

level=fatal msg="signing the blobs: signing the file /tmp.k8s/release-sign-blobs-1616671436/kubernetes-release/stage/v1.24.9-rc.0.42+15da17757dcdd3/v1.24.9/gcs-stage/v1.24.9/bin/windows/amd64/kubectl.exe: verifying signed file: /tmp.k8s/release-sign-blobs-1616671436/kubernetes-release/stage/v1.24.9-rc.0.42+15da17757dcdd3/v1.24.9/gcs-stage/v1.24.9/bin/windows/amd64/kubectl.exe: open /tmp.k8s/release-sign-blobs-1616671436/kubernetes-release/stage/v1.24.9-rc.0.42+15da17757dcdd3/v1.24.9/gcs-stage/v1.24.9/bin/windows/amd64/kubectl.exe.cert: no such file or directory"

Signing Flake on 1.23.15

level=fatal msg="signing the blobs: signing the file /tmp.k8s/release-sign-blobs-691105010/kubernetes-release-gcb/stage/v1.23.15-rc.0.32+06089cc90f824e/v1.23.15/gcs-stage/v1.23.15/bin/windows/arm64/kubectl-convert.exe: verifying signed file: /tmp.k8s/release-sign-blobs-691105010/kubernetes-release-gcb/stage/v1.23.15-rc.0.32+06089cc90f824e/v1.23.15/gcs-stage/v1.23.15/bin/windows/arm64/kubectl-convert.exe: open /tmp.k8s/release-sign-blobs-691105010/kubernetes-release-gcb/stage/v1.23.15-rc.0.32+06089cc90f824e/v1.23.15/gcs-stage/v1.23.15/bin/windows/arm64/kubectl-convert.exe.cert: no such file or directory"

Signing Flake on 1.22.17

level=fatal msg="signing the blobs: signing the file /tmp.k8s/release-sign-blobs-3048792261/kubernetes-release/stage/v1.22.17-rc.0.16+611514908b25d5/v1.22.17/gcs-stage/v1.22.17/bin/windows/amd64/kubelet.exe: verifying signed file: /tmp.k8s/release-sign-blobs-3048792261/kubernetes-release/stage/v1.22.17-rc.0.16+611514908b25d5/v1.22.17/gcs-stage/v1.22.17/bin/windows/amd64/kubelet.exe: open /tmp.k8s/release-sign-blobs-3048792261/kubernetes-release/stage/v1.22.17-rc.0.16+611514908b25d5/v1.22.17/gcs-stage/v1.22.17/bin/windows/amd64/kubelet.exe.cert: no such file or directory"

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Others:

jeremyrickard avatar Dec 09 '22 03:12 jeremyrickard

@cpanato this seems to be another race we hit when signing release artifacts. Do you want to give this a look? (maybe @puerco already did)

saschagrunert avatar Dec 09 '22 08:12 saschagrunert

/remove-label priority/important-soon
/priority critical-urgent

xmudrii avatar Jan 24 '23 15:01 xmudrii

@xmudrii: The label(s) /remove-label priority/important-soon cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda, refactor. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?

In response to this:

/remove-label priority/important-soon
/priority critical-urgent

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jan 24 '23 15:01 k8s-ci-robot

@kubernetes/release-managers Carlos is out for a couple of days. Do we have any volunteers to support here?

saschagrunert avatar Jan 25 '23 07:01 saschagrunert

First investigation: The certificate (.cert) file has to be written in cosign, after writing the signature: https://github.com/sigstore/cosign/blob/d1c6336475b4be26bb7fb52d97f56ea0a1767f9f/cmd/cosign/cli/sign/sign_blob.go#L120-L129

It looks like we never reach the point where the file is written, so I'm assuming that len(rekorBytes) == 0 :thinking:

saschagrunert avatar Jan 25 '23 08:01 saschagrunert

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Apr 25 '23 08:04 k8s-triage-robot

/remove-lifecycle stale

xmudrii avatar Apr 25 '23 08:04 xmudrii

/assign

matglas avatar Apr 29 '23 20:04 matglas

@jeremyrickard would it be possible to see more of the logs? I can't access them with the links above.

I am trying to build some context for myself to understand where in the process this happens. Is this the part triggered by krel release? If so, where during that part do we sign blobs?

matglas avatar May 08 '23 06:05 matglas

/lifecycle stale

k8s-triage-robot avatar Jan 19 '24 20:01 k8s-triage-robot

/remove-lifecycle stale

xmudrii avatar Jan 19 '24 20:01 xmudrii

/lifecycle stale

k8s-triage-robot avatar Apr 18 '24 21:04 k8s-triage-robot

/lifecycle frozen

xmudrii avatar Apr 19 '24 09:04 xmudrii