zfs-localpv icon indicating copy to clipboard operation
zfs-localpv copied to clipboard

feat(zfs-localpv): add option for choosing between refquota and quota

Open cinapm opened this issue 1 year ago • 16 comments

Why is this PR required? What issue does it fix?: resolve this issue: https://github.com/openebs/zfs-localpv/issues/423

What this PR does?: this PR adds an option for choosing between refquota or quota and uses this value as zfs command options

Checklist:

  • [X] Fixes #
  • [x] PR Title follows the convention of <type>(<scope>): <subject>
  • [ ] Has the change log section been updated?
  • [ ] Commit has unit tests
  • [ ] Commit has integration tests
  • [ ] (Optional) Are upgrade changes included in this PR? If not, mention the issue/PR to track:
  • [ ] (Optional) If documentation changes are required, which issue on https://github.com/openebs/openebs-docs is used to track them:

cinapm avatar Jun 13 '24 10:06 cinapm

:warning: Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 96.37%. Comparing base (e3d55d7) to head (ae2b7c2). Report is 2 commits behind head on develop.

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop     #542   +/-   ##
========================================
  Coverage    96.37%   96.37%           
========================================
  Files            1        1           
  Lines          496      496           
========================================
  Hits           478      478           
  Misses          14       14           
  Partials         4        4           
Flag Coverage Δ
bddtests 96.37% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Jun 14 '24 13:06 codecov-commenter

Thanks @cinapm. Looks good to me. I will let other folks review this. We will target this for this upcoming release.

Abhinandan-Purkait avatar Jun 20 '24 11:06 Abhinandan-Purkait

Have we tested this change by trying to apply a quota, and then a refquota underneath that dataset? Add a simple test as well perhaps.

dsharma-dc avatar Jun 20 '24 16:06 dsharma-dc

@cinapm Can you please let us know your plans on this PR? I would request you to consider the suggestion made by @niladrih here.

Abhinandan-Purkait avatar Jul 09 '24 05:07 Abhinandan-Purkait

This configuration option should be toggle-able through a storageclass.

i have added option to add quotatype parameter on storageclass, the default value is quota

cinapm avatar Jul 11 '24 16:07 cinapm

@cinapm Can you please let us know your plans on this PR? I would request you to consider the suggestion made by @niladrih here.

instead of using config for local-pv, I have changed the controller to accept and use quotatype parameter on storageclass

cinapm avatar Jul 11 '24 16:07 cinapm

@cinapm Thanks, we will review this. The team needs to decide on this because this changes the CRDs and changing anything on the CRDs would need a CRD version bump which now raises the question of upgrades and migration. cc @niladrih @avishnu @tiagolobocastro

Abhinandan-Purkait avatar Jul 15 '24 14:07 Abhinandan-Purkait

I would strongly suggest to verify the impact on the thick provisioning side of things : with a thinprovision: "no" the property reservation is set, it would probably be mandatory to switch to refreservation at the same time that refquota is used.

Let me explain the potential problem with the typical usage of snapshots :

  • snapshots are considered in quota but are not when using refquota
  • snapshots are considered in reservation but are not when using refreservation

You could end up with a non functionnal thick provisioning while mixing refquota with reservation

Let me take an exemple with a workload that requires 10GB of free space, with thick provisioning :

  • a PersitentVolume of 10GB
  • refquota and reservation being set (through thinprovision: "no")
  • with refquota, you can create an unlimited amount of GB of snapshots
  • with reservation, snapshots are considered as consuming space
  • this would mean that with 3GB of space being used and 9GB of snapshots, there would be 7GB of free space from a zfs FS standpoint, and that reservation would be exceeded by 2GB
  • so 2GB would be thin provisioned whereas the configuration is set to thick provisioning

Using refreservation while setting refquota and thinprovision: "no" would solve the issue.

abuisine avatar Jul 15 '24 16:07 abuisine

Also, I would like to confirm the huge value of this PR : without refquota and refreservation the usage of snapshots becomes almost impossible as they are deduced from free space. From an operational standpoint, it basically means that backups can impact production. We never used snapshots with this CSI for this reason.

This PR (with refreservation) would enable safe usage of snapshots in production.

abuisine avatar Jul 15 '24 16:07 abuisine

I would strongly suggest to verify the impact on the thick provisioning side of things : with a thinprovision: "no" the property reservation is set, it would probably be mandatory to switch to refreservation at the same time that refquota is used.

Let me explain the potential problem with the typical usage of snapshots :

  • snapshots are considered in quota but are not when using refquota
  • snapshots are considered in reservation but are not when using refreservation

You could end up with a non functionnal thick provisioning while mixing refquota with reservation

Let me take an exemple with a workload that requires 10GB of free space, with thick provisioning :

  • a PersitentVolume of 10GB
  • refquota and reservation being set (through thinprovision: "no")
  • with refquota, you can create an unlimited amount of GB of snapshots
  • with reservation, snapshots are considered as consuming space
  • this would mean that with 3GB of space being used and 9GB of snapshots, there would be 7GB of free space from a zfs FS standpoint, and that reservation would be exceeded by 2GB
  • so 2GB would be thin provisioned whereas the configuration is set to thick provisioning

Using refreservation while setting refquota and thinprovision: "no" would solve the issue.

Your point is very valid, and based on the points you mentioned, I have made the changes. now when there is option thinprovision: "no" , the reservation will be set for quota and refreservation will be set for refquota

cinapm avatar Jul 15 '24 22:07 cinapm

make manifests @Abhinandan-Purkait

i didn't change them manually, first of all I changed v1/zfsvolume.go and after that I have used following commands: make bootstrap make manifests

the make manifests command has changed zfs-restore and zfs-snapshot crds automatically based on changes in v1/zfsvolume.go

cinapm avatar Jul 18 '24 20:07 cinapm

make manifests @Abhinandan-Purkait

i didn't change them manually, first of all I changed v1/zfsvolume.go and after that I have used following commands: make bootstrap make manifests

the make manifests command has changed zfs-restore and zfs-snapshot crds automatically based on changes in v1/zfsvolume.go

Okay, I see the embedded volSpec in the restores and snapshot structures. Okay. It would help if you rebased your branch, there were some changes merged on the deepcopy methods. Also please run make kubegen although it won't change anything in your case, but it's worth checking.

You also need to copy the crd changes into the helm chart crd templates, as that's how crds get installed. Check https://github.com/openebs/zfs-localpv/tree/develop/deploy/helm/charts/charts/crds/templates

The other changes look good to me. It might be worth testing upgrade once before we take this in. Example, installing the previous version of zfs and upgrading it to this version and see the crd changes. Thanks

Abhinandan-Purkait avatar Jul 19 '24 05:07 Abhinandan-Purkait

make manifests @Abhinandan-Purkait

i didn't change them manually, first of all I changed v1/zfsvolume.go and after that I have used following commands: make bootstrap make manifests the make manifests command has changed zfs-restore and zfs-snapshot crds automatically based on changes in v1/zfsvolume.go

Okay, I see the embedded volSpec in the restores and snapshot structures. Okay. It would help if you rebased your branch, there were some changes merged on the deepcopy methods. Also please run make kubegen although it won't change anything in your case, but it's worth checking.

You also need to copy the crd changes into the helm chart crd templates, as that's how crds get installed. Check https://github.com/openebs/zfs-localpv/tree/develop/deploy/helm/charts/charts/crds/templates

The other changes look good to me. It might be worth testing upgrade once before we take this in. Example, installing the previous version of zfs and upgrading it to this version and see the crd changes. Thanks

@Abhinandan-Purkait

as you said, first of all, I have rebased my branch and after that I added crds changes into the corresponding helm manifests

cinapm avatar Jul 19 '24 15:07 cinapm

@cinapm It will be good if you add the test case to check the refquota and refreservation Something on this lines is alredy present here https://github.com/openebs/zfs-localpv/blob/2f938970d5bb03322e0fbdc779453c639ef6545c/tests/utils.go#L131

You can find the details about this is this issue #560 The conclusion is here https://github.com/openebs/zfs-localpv/issues/560#issuecomment-2232535073

sinhaashish avatar Jul 31 '24 06:07 sinhaashish

@cinapm Please rebase and run make manifests a lot of changes went in on the testing and ci side of things. Thanks

Abhinandan-Purkait avatar Aug 01 '24 06:08 Abhinandan-Purkait

@cinapm I see there isn't any activity here for a while, is there any blocker that you are facing that you need help with?

Abhinandan-Purkait avatar Aug 22 '24 06:08 Abhinandan-Purkait

@cinapm I see there isn't any activity here for a while, is there any blocker that you are facing that you need help with?

I have rebased the develop branch and also run make manifests

cinapm avatar Sep 06 '24 16:09 cinapm

@cinapm Please rebase and run make manifests a lot of changes went in on the testing and ci side of things. Thanks

@Abhinandan-Purkait I have resolved them

cinapm avatar Sep 06 '24 16:09 cinapm

@niladrih @tiagolobocastro @sinhaashish PTAL

Abhinandan-Purkait avatar Sep 09 '24 07:09 Abhinandan-Purkait

I am approving it , but better add the test here here

sinhaashish avatar Sep 10 '24 03:09 sinhaashish

@cinapm Just making sure there’s nothing pending on your side. If everything looks good, we can go ahead and merge it

sinhaashish avatar Sep 17 '24 06:09 sinhaashish

@cinapm Just making sure there’s nothing pending on your side. If everything looks good, we can go ahead and merge it

@sinhaashish
there is nothing on my side, everything is ok

cinapm avatar Sep 18 '24 06:09 cinapm

@cinapm Thanks. Merging the PR.

Abhinandan-Purkait avatar Sep 21 '24 16:09 Abhinandan-Purkait