
v6.0.1 controller seems to break k8s e2e tests

Open mattcary opened this issue 3 years ago • 6 comments

See https://testgrid.k8s.io/sig-storage-kubernetes#volume-snapshot. The failures started after https://github.com/kubernetes/kubernetes/pull/110204 was merged, which updated the controller from v4 to v6.

Failures for the mock csi driver tests are due to the VolumeSnapshot not being ready, but the snapshot-controller logs show that the VolumeSnapshotContents are being marked as ready. Maybe something changed between v4 and v6 in the processing somehow?

I can continue to dig but if anyone else has any ideas on what's going on please chime in!

/assign
/cc @humblec
/cc @xing-yang

(@xing-yang it appears the RBAC problem you saw is a red herring: it happens during shutdown, when there's a race between the controller getting killed and the RBAC getting destroyed. The reason for the test failure seems to be this mismatch, where the VolumeSnapshotContents are ready but the VolumeSnapshot is not.)

mattcary avatar Jun 27 '22 18:06 mattcary

In v6, v1beta1 is no longer served. We've updated snapshot CRDs and snapshot-controller to v6. We need to update the sidecar images: https://github.com/kubernetes/kubernetes/tree/master/test/e2e/testing-manifests/storage-csi
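To illustrate what "no longer served" means in practice, here is a minimal sketch that inspects the `spec.versions` list of a CustomResourceDefinition and reports which API versions have `served: true`. The `sample_crd` dictionary and the helper names `served_versions` and `is_served` are assumptions made up for this example; the data is not copied from the actual v6 manifests, so check the real CRDs in your cluster before relying on it.

```python
from typing import Any, Dict, List


def served_versions(crd: Dict[str, Any]) -> List[str]:
    """Return the names of the API versions that a CRD marks as served."""
    versions = crd.get("spec", {}).get("versions", [])
    return [v["name"] for v in versions if v.get("served", False)]


def is_served(crd: Dict[str, Any], version: str) -> bool:
    """Report whether the given API version is served by the CRD."""
    return version in served_versions(crd)


# Hypothetical CRD fragment for illustration only (assumed values,
# not taken from the external-snapshotter release manifests).
sample_crd: Dict[str, Any] = {
    "apiVersion": "apiextensions.k8s.io/v1",
    "kind": "CustomResourceDefinition",
    "metadata": {"name": "volumesnapshots.snapshot.storage.k8s.io"},
    "spec": {
        "group": "snapshot.storage.k8s.io",
        "versions": [
            {"name": "v1", "served": True, "storage": True},
            {"name": "v1beta1", "served": False, "storage": False},
        ],
    },
}

if __name__ == "__main__":
    for version in ("v1", "v1beta1"):
        state = "served" if is_served(sample_crd, version) else "not served"
        print(f"{version}: {state}")
```

A client or sidecar that still requests a version that is not served will not find the resources, which would be consistent with the symptom described above.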

xing-yang avatar Jun 27 '22 21:06 xing-yang

Ah... I was looking for error messages around that, but I guess the tests just aren't going to find the new resources.

mattcary avatar Jun 27 '22 21:06 mattcary

@mattcary @xing-yang shall I drop a PR to fix the same?

humblec avatar Jun 28 '22 04:06 humblec

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Sep 26 '22 04:09 k8s-triage-robot

Here's an issue to track this: https://github.com/kubernetes/kubernetes/issues/112694

xing-yang avatar Sep 26 '22 13:09 xing-yang

/remove-lifecycle stale

xing-yang avatar Sep 26 '22 13:09 xing-yang

/lifecycle stale

k8s-triage-robot avatar Dec 25 '22 14:12 k8s-triage-robot

/remove-lifecycle stale

mattcary avatar Dec 28 '22 01:12 mattcary

/close

Tests are passing now

mattcary avatar Dec 28 '22 01:12 mattcary

@mattcary: Closing this issue.


k8s-ci-robot avatar Dec 28 '22 01:12 k8s-ci-robot