external-snapshotter v6.0.1 controller seems to break k8s e2e tests

See https://testgrid.k8s.io/sig-storage-kubernetes#volume-snapshot. The failures started after https://github.com/kubernetes/kubernetes/pull/110204 was merged, which updated the controller from v4 to v6.

The mock CSI driver tests fail because the VolumeSnapshot never becomes ready, even though the snapshot-controller logs show the VolumeSnapshotContents being marked as ready. Maybe something changed in how snapshots are processed between v4 and v6?

I can continue to dig but if anyone else has any ideas on what's going on please chime in!

/assign
/cc @humblec
/cc @xing-yang

(@xing-yang it appears the RBAC problem you saw is a red herring. It happens during shutdown, when there's a race between the controller getting killed and the RBAC rules getting destroyed. The actual reason for the test failure seems to be this mismatch where the snapshot contents are ready but the snapshot is not.)

Jun 27 '22 18:06 mattcary

In v6, v1beta1 is no longer served. We've updated snapshot CRDs and snapshot-controller to v6. We need to update the sidecar images: https://github.com/kubernetes/kubernetes/tree/master/test/e2e/testing-manifests/storage-csi
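
A minimal sketch of how one might confirm which API versions a snapshot CRD actually serves, so a client still pinned to v1beta1 can be spotted. The manifest below is a hypothetical, trimmed-down stand-in for illustration (not the real v6 CRD), and the helper names `served_versions` and `is_served` are made up for this example:

```python
# Hypothetical, heavily trimmed CRD manifest for illustration only.
# A real CustomResourceDefinition carries many more fields (schema, names, ...).
sample_crd = {
    "metadata": {"name": "volumesnapshots.snapshot.storage.k8s.io"},
    "spec": {
        "versions": [
            {"name": "v1", "served": True, "storage": True},
            {"name": "v1beta1", "served": False, "storage": False},
        ]
    },
}


def served_versions(crd):
    """Return the names of all versions the CRD marks as served."""
    versions = crd.get("spec", {}).get("versions", [])
    return [v["name"] for v in versions if v.get("served", False)]


def is_served(crd, version):
    """Report whether a given API version is served by the CRD."""
    return version in served_versions(crd)


if __name__ == "__main__":
    print("served:", served_versions(sample_crd))
    print("v1beta1 served:", is_served(sample_crd, "v1beta1"))
```

The same functions could be fed the parsed output of `kubectl get crd volumesnapshots.snapshot.storage.k8s.io -o json` from a live cluster. A sidecar that only requests v1beta1 objects would not find them if that version is absent from the served list.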

Jun 27 '22 21:06 xing-yang

Ah... I was looking for error messages around that, but I guess the tests just aren't going to find the new resources.

Jun 27 '22 21:06 mattcary

@mattcary @xing-yang shall I drop a PR to fix the same?

Jun 28 '22 04:06 humblec

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Sep 26 '22 04:09 k8s-triage-robot

Here's an issue to track this: https://github.com/kubernetes/kubernetes/issues/112694

Sep 26 '22 13:09 xing-yang

/remove-lifecycle stale

Sep 26 '22 13:09 xing-yang

/lifecycle stale

Dec 25 '22 14:12 k8s-triage-robot

/remove-lifecycle stale

Dec 28 '22 01:12 mattcary

/close

Tests are passing now

Dec 28 '22 01:12 mattcary

@mattcary: Closing this issue.

Dec 28 '22 01:12 k8s-ci-robot
