external-snapshotter
VolumeSnapshot Storage Quota
Hello!
Is it possible today (or in the future) for ResourceQuotas to support storage snapshots based on size as opposed to object count? For my purposes, I don't find setting a ResourceQuota by object count for storage objects very useful (though I'm sure it is for some). The resource a snapshot actually consumes is storage space, so a mechanism to control that space would be ideal. I understand there are probably some challenges with the scoping of CRDs (VolumeSnapshotContents), and unfortunately I'm not here to offer a solution, but perhaps there are some clever folks out there who may be able to?
Ideally, VolumeSnapshots would count against a ResourceQuota's `requests.storage` quota, possibly by extending resource requests for storage into the CRD using the size of the PVC or the underlying VolumeSnapshotContents. Much like PersistentVolumeClaims, which count against `requests.storage` in a quota and are themselves mapped to other cluster resources (PersistentVolumes), would the same be possible for VolumeSnapshots? For our use case, I can't see us allowing snapshots in a cluster without precise control over the storage consumption of different users. Thanks for hearing me out.
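To illustrate: object-count quota on the VolumeSnapshot CRD is already possible today with the generic `count/<resource>.<group>` syntax; it's the size-based part that's missing. A rough sketch of what I mean, where the commented-out key is purely hypothetical:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: snapshot-quota
  namespace: team-a            # example namespace
spec:
  hard:
    # Works today: caps how many VolumeSnapshot objects can exist in the namespace.
    count/volumesnapshots.snapshot.storage.k8s.io: "10"
    # Does not exist today: a hypothetical capacity-based key analogous to
    # requests.storage for PersistentVolumeClaims, which is what this issue asks for.
    # snapshot-size.storage.k8s.io: 100Gi
```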
@ctml91 I looked at this a long time ago but didn't find a solution, due to the challenges you've already discovered. Storage quota is not designed to consider CRDs, so there isn't a way to support snapshot quota currently.
Thanks for your feedback @xing-yang!
Completely understood that there might be some limitations in the design of quotas and/or VolumeSnapshots. Unfortunately it's above my level of understanding and I'm surely oversimplifying, but is it not possible to extend the interface providing `requests.storage` from core so it can be consumed by a VolumeSnapshot and picked up by the quota? Or is it not suitable for a VolumeSnapshot because it has been designed specifically for PersistentVolumeClaims and PVs, and as a result not something external-snapshotter can implement without changes to core? Hopefully it can be a future consideration, if not in external-snapshotter then in core, to make it a possibility.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:
> /close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
For the record, I talked to @deads2k and this would be the first step to get quota for the number of VolumeSnapshots of each VolumeSnapshotClass in a namespace:
- Implement a new in-tree resource, say `CustomResourceEvaluator`, that would provide a quota-able value for a CR (VolumeSnapshot) using CEL expressions. A quick and dirty CEL sketch:

  ```yaml
  kind: CustomResourceEvaluator
  metadata:
    name: snapshotQuota
  spec:
    resources:
      - volumesnapshots.snapshot.storage.k8s.io
    quotaFields:
      - name:
          cel: "'volumeSnapshotClass.' + self.spec.volumeSnapshotClassName"
      - value:
          cel: "1"
  ```

- Implement a new quota Evaluator that, based on `CustomResourceEvaluator` instances, would collect VolumeSnapshots and for each of them compute a new quota entry `volumeSnapshotClass.<name of a class>: 1`, so users can have a quota for the number of snapshots in a namespace (say `volumeSnapshotClass.foo: 10` and `volumeSnapshotClass.bar: 20`); see the ResourceQuota sketch below.
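To make that concrete, the ResourceQuota an admin would write could look like this; the `volumeSnapshotClass.*` keys are hypothetical and exist only in this proposal, not in any released Kubernetes:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: snapshot-count-quota
  namespace: team-a            # hypothetical namespace used for illustration
spec:
  hard:
    # Hypothetical keys produced by the proposed evaluator: at most
    # 10 snapshots of class "foo" and 20 of class "bar" in this namespace.
    volumeSnapshotClass.foo: "10"
    volumeSnapshotClass.bar: "20"
```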
This does not solve capacity yet, because there is no capacity field in a snapshot and it's not possible to do cross-object quota (i.e. count `PVC.status.capacity` for a VolumeSnapshot). We would need to 1) add capacity to VolumeSnapshot and 2) ensure that `VolumeSnapshot.spec.capacity` is the same as `PVC.status.capacity`, so users can't cheat. A corresponding CEL sketch could then look like:
```yaml
- name:
    cel: "'volumeSnapshotClassCapacity.' + self.spec.volumeSnapshotClassName"
- value:
    cel: "self.spec.capacity"
```
And the new quota Evaluator would calculate the sum of all capacities per VolumeSnapshotClass. The quota would look like `volumeSnapshotClassCapacity.foo: 10Gi` and `volumeSnapshotClassCapacity.bar: 20Gi`.
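Again, purely as a hypothetical sketch of the resulting admin-facing object, assuming both the capacity field and the new evaluator existed:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: snapshot-capacity-quota
  namespace: team-a            # hypothetical namespace used for illustration
spec:
  hard:
    # Hypothetical keys: total snapshot capacity per VolumeSnapshotClass,
    # summed from the proposed VolumeSnapshot.spec.capacity field.
    volumeSnapshotClassCapacity.foo: 10Gi
    volumeSnapshotClassCapacity.bar: 20Gi
```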
However, the new Evaluator + CEL is a complex project to implement! I am just dumping my brain here, so the knowledge is not lost.
Another thing I was warned not to do: we should not copy quota code into snapshot-controller and webhook admission. It's complex and heavily optimized code, and the implementation should stay in a single place (k/k).
/reopen
@xing-yang: Reopened this issue.
In response to this:
/reopen
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/remove-lifecycle rotten
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:
> /close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.