quota: disable quota check for ZVOL
Motivation and Context
I have created to separate PRs as I think the solution for ZVOL and dataset are quite different one and should be considered separately.
This is continuation of the https://github.com/openzfs/zfs/pull/12868. We based our assumption that the issue is a smoothing mechanism. This graph shows the dirty buff change during write:

As pointed out by Matt on our previous call the dirty buffer doesn’t hit the maximum while it suddenly stops. So we look if the existing smoothing mechanism is actually used, and it turns out that it isn't. So we look into what is causing this stop. And we found out that the issue is the quota mechanism. The dsl_dir_tempreserve_impl is causing a hard stop if the used_on_disk + est_inflight >= quota . This callcalation don’t take in account if the new data override old ones or not. If the data are just override, we still issuing a full stop.
Description
The quota for ZVOLs is set to the size of the volume. When the quota reaches the maximum, there isn't an excellent way to check if the new writers are overwriting the data or if they are inserting a new one. Because of that, when we reach the maximum quota, we wait till txg is flushed. This is causing a significant fluctuation in bandwidth.
In the case of ZVOL, the quota is enforced by the volsize, so we can omit it.
Sponsored-by: Zededa Inc. Sponsored-by: Klara Inc.
How Has This Been Tested?
We test this using fio tool. Without this test we can notice uneven bandwidth which finally drops to 0, until the TXG will be flushed. With this patch we see a nice stable bandwidth.
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Performance enhancement (non-breaking change which improves efficiency)
- [ ] Code cleanup (non-breaking change which makes code smaller or more readable)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
- [ ] Documentation (a change to man pages or other documentation)
Checklist:
- [x] My code follows the OpenZFS code style requirements.
- [x] I have updated the documentation accordingly.
- [x] I have read the contributing document.
- [ ] I have added tests to cover my changes.
- [ ] I have run the ZFS Test Suite with this change applied.
- [x] All commit messages are properly formatted and contain
Signed-off-by.
I agree that quota check does not make much sense for ZVOLs with reservation, where you should not be able to create ZVOL above the parent dataset quota or snapshot it above parent quota, or set parent quota too low, etc, while data writes should succeed (except may be some pathologically small volblocksize and large ashift?). But it make complete sense to me for sparse ZVOLs (both as parent or ZVOL property, not sure why quota property is ZFS_TYPE_FILESYSTEM only). The only argument I remember from one of the recent monthly OpenZFS calls where it was discussed is that some applications may not handle that quota error properly, but I consider it as their own problems, not ZFS.
Saying all that, I could possibly live with this change, but I don't like it. It breaks existing ZFS designs. Doesn't PR #13839 addresses it enough? Or may be we could somehow check for refreservation >= volsize to ignore the quota?
I agree that quota check does not make much sense for ZVOLs with reservation, where you should not be able to create ZVOL above the parent dataset quota or snapshot it above parent quota, or set parent quota too low, etc, while data writes should succeed (except may be some pathologically small volblocksize and large ashift?). But it make complete sense to me for sparse ZVOLs (both as parent or ZVOL property, not sure why quota property is ZFS_TYPE_FILESYSTEM only). The only argument I remember from one of the recent monthly OpenZFS calls where it was discussed is that some applications may not handle that quota error properly, but I consider it as their own problems, not ZFS.
if we made the check, when type=zvol, do the recursion thing to check the parent dataset for a quota, would that resolve your concern?
Saying all that, I could possibly live with this change, but I don't like it. It breaks existing ZFS designs. Doesn't PR #13839 addresses it enough? Or may be we could somehow check for refreservation >= volsize to ignore the quota?
Your concern here is about a thin-provisioned ZVOL being able to use too much space?
Based on conversations had at the OpenZFS Developer Summit, we have hidden the "don't check the quota if the dataset is a zvol" feature behind an off-by-default tunable, so that in the future if we add the ability to set a quota on a zvol (specifically other than the volsize) that it will be enforced.
Some of the performance is regained by the recently merged https://github.com/openzfs/zfs/commit/8af08a69cda63e6d7983fc2f32f9fed4155b95be that allows going slightly over the quota, but this change/tunable is still useful for very busy zvols to avoid getting throttled by the quota.
Before this change, you could set a quota on a filesystem, and it would ensure that you can't use (much) more space than the quota. From the zfsprops manpage:
quota=size|none
Limits the amount of space a dataset and its descendents can consume.
This property enforces a hard limit on the amount of space used. This
includes all space consumed by descendents ...
After this change, you can set a quota on a filesystem, but it might not actually limit the space used by the descendents, if there is a descendant sparse zvol. This is a significant change in existing behavior. It means that quotas can be circumvented.
I see that @amotin raised this concern here but I don't see a response. Was there additional discussion of this that I missed? I was surprised to see this change land and I'm concerned that others may also want to give input on a change of this significance.
I would suggest that this commit be reverted until this concern can be addressed.
Thanks @behlendorf for pointing out that this commit doesn't actually "disable quota check for ZVOL", it merely adds a tunable to allow for disabling the quota check. The default behavior is unchanged. So I don't think that this needs to be reverted.
No problem, the final commit message here was unfortunately a bit misleading. But that's right, this change doesn't change the default behavior, it only adds an optional tunable.