
filesystem is not resized when restoring from snapshot/cloning to larger size than origin

Open · RomanBednar opened this issue 2 years ago · 21 comments

What type of PR is this?

/kind bug

What this PR does / why we need it:

The volume filesystem is not expanded during a clone when the PVC data source is larger than the original volume. Similarly, when restoring a snapshot to a PVC with a larger size, the filesystem is not expanded either.

This patch uses mount-utils to handle the volume size check and resizing.
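For illustration, the idea behind the fix (grow the filesystem during NodeStageVolume when the block device is larger, e.g. after a clone or snapshot restore into a bigger PVC) could be sketched as below. This is a self-contained sketch, not the driver's actual code: the `Resizer` interface is a stand-in modeled on mount-utils' `ResizeFs` (`NeedResize`/`Resize`), and `fakeResizer` is purely illustrative.

```go
package main

import "fmt"

// Resizer mirrors the relevant part of k8s.io/mount-utils' ResizeFs
// (NeedResize/Resize); an illustrative stand-in, not the real API.
type Resizer interface {
	NeedResize(devicePath, deviceMountPath string) (bool, error)
	Resize(devicePath, deviceMountPath string) (bool, error)
}

// maybeResizeOnStage sketches the extra step added to NodeStageVolume:
// after mounting, grow the filesystem if it is smaller than the device.
func maybeResizeOnStage(r Resizer, devicePath, mountPath string) error {
	need, err := r.NeedResize(devicePath, mountPath)
	if err != nil {
		return fmt.Errorf("checking if %s needs resize: %v", devicePath, err)
	}
	if !need {
		return nil
	}
	if _, err := r.Resize(devicePath, mountPath); err != nil {
		return fmt.Errorf("resizing %s: %v", devicePath, err)
	}
	return nil
}

// fakeResizer simulates a device whose filesystem is smaller than the
// underlying block device, so one resize is required.
type fakeResizer struct{ resized bool }

func (f *fakeResizer) NeedResize(dev, mnt string) (bool, error) { return !f.resized, nil }
func (f *fakeResizer) Resize(dev, mnt string) (bool, error)     { f.resized = true; return true, nil }

func main() {
	f := &fakeResizer{}
	if err := maybeResizeOnStage(f, "/dev/sdc", "/var/lib/kubelet/stage"); err != nil {
		panic(err)
	}
	fmt.Println("resized:", f.resized)
}
```

The check-then-resize shape keeps the happy path (filesystem already matches the device) cheap, which matters since NodeStageVolume runs on every stage, not just after expansion.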

Which issue(s) this PR fixes:

Fixes #784

Special notes for your reviewer: Here is how I verified the patch manually.

BEFORE PATCH:

1. Create a PVC with 1Gi and a pod

2. Create a clone PVC with 5Gi and a pod

$ oc get pvc/clone-of-pvc-1 -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
    volume.kubernetes.io/selected-node: ci-ln-469g9qt-72292-bp8k2-worker-a-jmrcv
    volume.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
  creationTimestamp: "2022-04-11T06:30:29Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: clone-of-pvc-1
  namespace: test
  resourceVersion: "40853"
  uid: 4d439336-35b1-4836-8ece-2c21ce36d1b2
spec:
  accessModes:
  - ReadWriteOnce
  dataSource:
    apiGroup: null
    kind: PersistentVolumeClaim
    name: pvc2
  resources:
    requests:
      storage: 5Gi
  storageClassName: standard-csi
  volumeMode: Filesystem
  volumeName: pvc-4d439336-35b1-4836-8ece-2c21ce36d1b2
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 5Gi
  phase: Bound
  
3. Check PVCs

$ oc get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
clone-of-pvc-1   Bound    pvc-4d439336-35b1-4836-8ece-2c21ce36d1b2   5Gi        RWO            standard-csi   59m
pvc2             Bound    pvc-9508e898-d958-412b-b2c8-4523f87d7b25   1Gi        RWO            standard-csi   69m

4. The filesystem was not resized in the pod (5Gi expected)

$ oc exec mypodclone -t -i -- df -h /tmp
Filesystem                Size      Used Available Use% Mounted on
/dev/sdc                975.9M      2.5M    957.4M   0% /tmp

AFTER PATCH:

1. Create a PVC with 1Gi and a pod

2. Create a snapshot of the PVC

3. Create a clone of the PVC with 5Gi and a pod

4. Restore the snapshot to a new PVC with 5Gi and a pod

5. Create a pod that mounts both the clone and the restored snapshot PVC

6. Check that the filesystem was resized correctly in the pod

$ oc get pvc
NAME                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-1                    Bound    pvc-71b5bb22-2ca7-4150-af44-b5b0b48111a7   1Gi        RWO            standard-csi   15m
pvc-1-clone              Bound    pvc-e6ebb004-baba-446b-b5f5-90bac00240c9   5Gi        RWO            standard-csi   14m
pvc-1-snapshot-restore   Bound    pvc-6d325c59-c6e1-428b-b83d-958c78c71964   5Gi        RWO            standard-csi   13m

$ oc exec task-pv-pod-2 -t -i -- df -h | grep -E "clone|restore"
/dev/sdd        4.9G  4.0M  4.9G   1% /tmp/clone
/dev/sdc        4.9G  4.0M  4.9G   1% /tmp/restore

Does this PR introduce a user-facing change?:

The filesystem is now expanded correctly when restoring from a snapshot or cloning a volume to a larger size than the origin volume.

RomanBednar avatar May 03 '22 08:05 RomanBednar

Welcome @RomanBednar!

It looks like this is your first PR to kubernetes-sigs/gcp-compute-persistent-disk-csi-driver 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/gcp-compute-persistent-disk-csi-driver has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. :smiley:

k8s-ci-robot avatar May 03 '22 08:05 k8s-ci-robot

Hi @RomanBednar. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar May 03 '22 08:05 k8s-ci-robot

@mattcary Hi, I've noticed there are tests for NodeExpandVolume that are commented out with a note about fakeexec being too brittle:

https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/blob/00986499d2d4e8453cc85b2bee2fede5600dcc97/pkg/gce-pd-csi-driver/node_test.go#L454

But by adding the mount-utils resizer to NodeStageVolume I broke the existing unit tests for this function, because there are actual command executions in the resize flow, so I used fakeexec just to have a working example. A better approach might be to use a mock interface instead. What do you think?
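To show what "brittle" means here, below is a self-contained illustration of the scripted fake-exec pattern. The types are stand-ins modeled on the testing fakes in k8s.io/utils/exec, not the real fakeexec API; the commands in the comments are examples of what a resize flow might run.

```go
package main

import "fmt"

// fakeExec replays a predefined script of command outputs, in order.
// This mirrors the fakeexec pattern: each expected command execution
// must be scripted ahead of time, at the right position.
type fakeExec struct {
	script []scripted
	calls  int
}

type scripted struct {
	out string
	err error
}

func (f *fakeExec) CombinedOutput(cmd string, args ...string) (string, error) {
	if f.calls >= len(f.script) {
		return "", fmt.Errorf("unexpected command %q: script exhausted", cmd)
	}
	s := f.script[f.calls]
	f.calls++
	return s.out, s.err
}

func main() {
	// Script the two reads a resize check might perform: block device
	// size first, then filesystem size (hypothetical command order).
	f := &fakeExec{script: []scripted{
		{out: "5368709120"}, // e.g. blockdev --getsize64
		{out: "1073741824"}, // e.g. filesystem size from dumpe2fs
	}}
	devSize, _ := f.CombinedOutput("blockdev", "--getsize64", "/dev/sdc")
	fsSize, _ := f.CombinedOutput("dumpe2fs", "-h", "/dev/sdc")
	fmt.Println("resize needed:", devSize != fsSize)
}
```

The brittleness is visible in the structure: any change to the number or order of commands the production code runs breaks the script, even when behavior is still correct. That is the trade-off against a mock interface, which tests at a coarser boundary.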

RomanBednar avatar May 03 '22 09:05 RomanBednar

/ok-to-test

mattcary avatar May 03 '22 15:05 mattcary

I dunno, creating a mock that would actually catch errors or regressions is very hard. Maybe just stick with fakeexec for now and we'll see how it does.

This should be exercised in e2e tests as it's part of the csi spec (but probably isn't?)

mattcary avatar May 03 '22 15:05 mattcary

I dunno, creating a mock that would actually catch errors or regressions is very hard. Maybe just stick with fakeexec for now and we'll see how it does.

This should be exercised in e2e tests as it's part of the csi spec (but probably isn't?)

We've investigated the tests with @jsafrane and we currently see two issues:

  1. GCP driver sanity tests use fakes (mounter + cloud provider). This causes the resize code to fail on command executions because it can't read the real size of the device. However, fakeexec with a predefined command sequence (like I used in the unit tests) is probably not an option for sanity tests, because the fake mounter is shared across many other tests.

    https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/blob/master/test/sanity/sanity_test.go#L62

  2. CSI e2e tests currently don't exercise the resize code for NodeStageVolume. Updating the tests is planned as part of the snapshot restore/resize effort; it is mentioned as the last item on the checklist here: https://github.com/kubernetes/kubernetes/issues/94929

I'll look into it and update the CSI tests accordingly once we resolve the sanity test issue. Another catch is that adding CSI e2e tests will take quite some time before we can use them (until the next Kubernetes release at best). @mattcary What would you suggest to fix the sanity tests? And do we want to hold this patch until e2e tests are available?

RomanBednar avatar May 10 '22 10:05 RomanBednar

Thanks for laying out the state of things so clearly!

Would it be easy to add a gcp-persistent-disk-csi-driver/test/e2e test? (confusingly this is not the k/k e2e test)

It should be about as straightforward as adding a sanity check, except it uses a real cloud provider etc, so no faking is necessary.

mattcary avatar May 10 '22 16:05 mattcary

@mattcary, while it would be possible to add a new test to test/e2e, it's not easy to fix e2e/sanity - all NodeStage calls there will fail, because the fake mounter / exec does not return a valid device size when asked. To me it looks like fixing that would take a huge amount of work with a doubtful result.

Out of curiosity, is there any reason why don't you run pull-gcp-compute-persistent-disk-csi-driver-sanity tests with a real cloud + mounter + exec? In addition, running k/k tests would be beneficial too, you are able to install the driver and run the tests in k/k CI, why not here?

jsafrane avatar May 11 '22 10:05 jsafrane

@mattcary, while it would be possible to add a new test to test/e2e, it's not easy to fix e2e/sanity - all NodeStage calls there will fail, because the fake mounter / exec does not return a valid device size when asked. To me it looks like fixing that would take a huge amount of work with a doubtful result.

I agree. Sorry for not making that more clear: I don't think we should try to fix sanity, I think we should just put it in an e2e test. That would mean skipping the resize tests in sanity (the other tests will continue to work, or have those been broken too?)

I'm not actually familiar with the sanity tests; I've never had a reason to look at them. If we do run them in a cloud context, would they substantially duplicate the e2e tests? I'll have to look into them. I suppose our e2e tests hit GCE-specific features that aren't covered by sanity, and sanity does more thorough generic CSI testing. If my supposition is true, I think it would probably be a good idea to run them with real resources. That's been reliable in the e2e tests, and I've never found fakes a very convincing test strategy.

Out of curiosity, is there any reason why don't you run pull-gcp-compute-persistent-disk-csi-driver-sanity tests with a real cloud + mounter + exec? In addition, running k/k tests would be beneficial too, you are able to install the driver and run the tests in k/k CI, why not here?

test/k8s-integration runs the k/k tests. There's real value in splitting those out from e2e tests as they take a lot longer to run and are way way flakier. Just bringing up a cluster consistently across k8s minor versions, different versions of kubetest, etc, is a lot of toil.

mattcary avatar May 11 '22 16:05 mattcary

That would be skipping resize tests in sanity (the other tests will continue to work, or have those been broken too?)

Unfortunately, all sanity tests that call NodeStage will fail, because NodeStage is not able to read device size with the fake exec.

If we do run them in a cloud context, would they substantially duplicate the e2e tests?

There definitely is some overlap in your custom e2e tests, kubernetes-integration and sanity test. The sanity tests add some checks that cannot be easily tested in Kubernetes, like that the driver returns the right error code when a volume does not exist, staging path does not exist and so on. If you want to use sanity tests, then you should run them against a real CSI driver in a real cloud. On the other hand, you don't need Kubernetes to run the tests (unless the driver itself needs to talk to Kubernetes). You need a VM in GCE, run the CSI driver in the VM (either in a container or directly on the host, it does not really matter) and poke the driver's CSI socket with the sanity tests. It will provision real volumes, attach them to the VM and mount them.

jsafrane avatar May 12 '22 12:05 jsafrane

See #991 and https://github.com/kubernetes/test-infra/pull/26354, we're making the sanity tests optional while we sort this out.

I think that unblocks this PR? (as soon as the test-infra change goes in)

mattcary avatar May 18 '22 16:05 mattcary

/refresh
/retest

jsafrane avatar May 23 '22 11:05 jsafrane

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Sep 13 '22 13:09 k8s-triage-robot

/remove-lifecycle stale

mattcary avatar Sep 13 '22 16:09 mattcary

I've noticed that mount-utils does not support Resize() on platforms other than Linux. NeedResize() was later added with the same restriction, which is reasonable, because having it without Resize() would not make sense.

So we can fix the failing Windows build just by bumping mount-utils and its dependencies.

@mattcary what do you think?

(I've seen some new CI failures after version bump due to mounter interface changes, trying to troubleshoot now)

RomanBednar avatar Sep 30 '22 16:09 RomanBednar

@RomanBednar: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: pull-gcp-compute-persistent-disk-csi-driver-sanity
Commit: c244d1fe2ceb144234983ca0586a613697a3be70
Details: link
Required: true
Rerun command: /test pull-gcp-compute-persistent-disk-csi-driver-sanity

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

k8s-ci-robot avatar Sep 30 '22 16:09 k8s-ci-robot

@mauriciopoppe and @jingxu97 for the windows questions.

/cc @mauriciopoppe /cc @jingxu97

mattcary avatar Sep 30 '22 16:09 mattcary

On the node side, NodeExpandVolume supports Windows through CSI Proxy (resize code) and doesn't use mount-utils; if there's a fix for Windows, it should go in that project instead.

This area is in active development, we're making changes to CSI Proxy so that it's just a library like mount-utils, see #1057

mauriciopoppe avatar Sep 30 '22 17:09 mauriciopoppe

On the node side NodeExpandVolume supports Windows through CSI Proxy (resize code) and doesn't use mount-utils, if there's a fix on Windows it should go on that project instead.

I see, thanks for the explanation. So if I understand correctly, we should not use mount-utils directly in the driver code but go through the proxy. This applies to both the Windows and Linux code, which should implement the same Resizefs interface.
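The shared-interface idea could look something like the sketch below. All names here are hypothetical illustrations, not the driver's actual types: the point is only that callers see one `Resizefs` abstraction, with the Linux implementation delegating to mount-utils and the Windows one delegating to CSI Proxy.

```go
package main

import (
	"fmt"
	"runtime"
)

// Resizefs is a hypothetical shared abstraction: one interface, two
// platform-specific implementations behind it.
type Resizefs interface {
	Resize(devicePath, deviceMountPath string) (bool, error)
}

// linuxResizer would delegate to mount-utils' ResizeFs on Linux.
type linuxResizer struct{}

func (linuxResizer) Resize(dev, mnt string) (bool, error) {
	// Real code would call mount-utils here; stubbed for illustration.
	return true, nil
}

// windowsResizer would delegate to the CSI Proxy client on Windows.
type windowsResizer struct{}

func (windowsResizer) Resize(dev, mnt string) (bool, error) {
	// Real code would go through CSI Proxy here; stubbed for illustration.
	return true, nil
}

// newResizer picks the platform implementation once; the rest of the
// node code only ever sees the Resizefs interface.
func newResizer() Resizefs {
	if runtime.GOOS == "windows" {
		return windowsResizer{}
	}
	return linuxResizer{}
}

func main() {
	r := newResizer()
	resized, err := r.Resize("/dev/sdc", "/var/lib/kubelet/stage")
	fmt.Println(resized, err)
}
```

Selecting the implementation at construction time (rather than branching on the platform at each call site) keeps NodeStageVolume identical on both platforms, which is presumably the appeal of the approach.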

I've changed the patch now and tested that the resizing works fine on Linux with both clone and snapshot restored to larger PVC size:

1. Origin, clone and snapshot PVCs:

$ oc get pvc
NAME                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-1                    Bound    pvc-b7865cd4-9f4c-4300-bb7d-99292d552d2c   1Gi        RWO            standard-csi   14m
pvc-1-clone              Bound    pvc-6bfd4b6f-8b28-40a8-ae7e-4f12cfe17fce   5Gi        RWO            standard-csi   3m11s
pvc-1-snapshot-restore   Bound    pvc-2cf1ddb2-002d-4a87-9475-d1b25a70907c   5Gi        RWO            standard-csi   2m57s

2. Filesystem resized correctly after creating a pod that mounts the clone PVC:

$ oc exec example-clone -t -i -- df -h | grep -E /usr/share/
/dev/sdc        4.9G  4.0M  4.9G   1% /usr/share/nginx/html

3. And the same for the PVC restored from snapshot:

$ oc exec example-snapshot -t -i -- df -h | grep -E /usr/share/
/dev/sdd        4.9G  4.0M  4.9G   1% /usr/share/nginx/html

@mauriciopoppe @mattcary @jsafrane what do you think about this approach?

RomanBednar avatar Oct 05 '22 13:10 RomanBednar

This looks right to me from the linux side, but let's let @mauriciopoppe comment

mattcary avatar Oct 05 '22 19:10 mattcary

Thanks for the review @mauriciopoppe !

/lgtm
/approve

(and of course thanks for the PR @RomanBednar :-)

mattcary avatar Oct 25 '22 15:10 mattcary

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mattcary, RomanBednar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Oct 25 '22 15:10 k8s-ci-robot