zfs icon indicating copy to clipboard operation
zfs copied to clipboard

Allow opt-in of zvol blocks in special class

Open don-brady opened this issue 1 year ago • 15 comments

Motivation and Context

The zfs special_small_blocks property allows a dataset to opt-in to using the special vdev class for small block allocations. This property is currently considered valid for filesystems but invalid for volumes. This distinction is somewhat arbitrary for an opt-in policy. In some situations, a small volume (relative to the size of the special class) might benefit from using the special class.

Description

Allow setting special_small_blocks property for volumes. This is effectively a policy change and not tied to a feature flag (other than feature@allocation_classes).

Note: there are no guardrails in place when setting this property and it is possible to oversubscribe the special class with more blocks than could possibly fit. This is not fatal but might be less than ideal from a performance perspective.

How Has This Been Tested?

Manually tested adding a zvol with special_small_blocks set and confirmed allocations went to the special class. Added new ZTS test to confirm it works: functional/alloc_class/alloc_class_016_pos Also imported pool with special_small_blocks set on a zvol using an older zfs kernel module.

$ sudo dd of=/dev/zvol/hybrid/zvol if=/dev/urandom bs=1M count=50
50+0 records in
50+0 records out
52428800 bytes (52 MB, 50 MiB) copied, 2.86468 s, 18.3 MB/s
$ zpool list -v hybrid
NAME        SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
hybrid      298G  51.0M   298G        -         -     0%     0%  1.00x    ONLINE  -
  sda2      149G   278K   149G        -         -     0%  0.00%      -    ONLINE
special        -      -      -        -         -      -      -      -         -
  sda3      149G  50.8M   149G        -         -     0%  0.03%      -    ONLINE

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Performance enhancement (non-breaking change which improves efficiency)
  • [ ] Code cleanup (non-breaking change which makes code smaller or more readable)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • [ ] Documentation (a change to man pages or other documentation)

Checklist:

  • [x] My code follows the OpenZFS code style requirements.
  • [x] I have updated the documentation accordingly.
  • [x] I have read the contributing document.
  • [x] I have added tests to cover my changes.
  • [x] I have run the ZFS Test Suite with this change applied.
  • [x] All commit messages are properly formatted and contain Signed-off-by.

don-brady avatar May 17 '23 21:05 don-brady

I might wonder if we would want a tunable defaulted to 0 like zfs_user_indirect_is_special for controlling whether zvol data blocks can land on there, otherwise we might have situations where special_small_blocks is set at, say, pool root, and after this change, we'd have a change in zvol behavior.

(Really, I'd like a pool property for it, not a global, but I don't know how backward compatible those are...)

rincebrain avatar May 18 '23 21:05 rincebrain

I might wonder if we would want a tunable defaulted to 0 like zfs_user_indirect_is_special for controlling whether zvol data blocks can land on there, otherwise we might have situations where special_small_blocks is set at, say, pool root, and after this change, we'd have a change in zvol behavior.

@rincebrain Ah, interesting point! I somehow missed that special_small_blocks is an inheritable property. Perhaps we should have a volume specific property, like say, volspecialsize, so it is truly an opt-in property for zvols.

don-brady avatar May 18 '23 23:05 don-brady

I think a new property would be a good option too, given the very different constraints involved, though the semantics might be weird because people will be annoyed if it can't be inherited and they have to manually set it, but setting a noop property on filesystems just so it's inherited on volumes is also unfortunate...

rincebrain avatar May 19 '23 00:05 rincebrain

Does this also respect the metadata reserve % tunable?

beren12 avatar May 19 '23 00:05 beren12

Does this also respect the metadata reserve % tunable?

yes

don-brady avatar May 19 '23 00:05 don-brady

Note that the other volume-specific properties are not inheritable. If you want all your zvols to have the same volblocksize there is no way to inherit that value.

don-brady avatar May 19 '23 00:05 don-brady

If you want all your zvols to have the same volblocksize there is no way to inherit that value.

Zvols don't have children, so there is nothing to inherit. Maybe zvols can be exempted from inheriting this property, since they're special?

scineram avatar May 19 '23 06:05 scineram

Yes, zvols can't have children. But unlike, say, volblocksize or refreservation, where the values don't make sense to inherit and are zvol specific, whether you store it on the special vdev might make sense to inherit, but just reusing special_small_blocks could yield amusing outcomes.

Idea: A novel property on zvols, honor_special_for_data or some better name, which can be on/off, and listens to special_small_blocks for data records if it's on. Then you can inherit or override special_small_blocks for zvols, and also control whether it cares about the setting for data blocks. (You still get to do N sets to turn it on on all the zvols, but it's the best I've got at the moment.)

rincebrain avatar May 19 '23 07:05 rincebrain

what is the current status of this? will it be included in the next release?

tcpluess avatar Apr 29 '24 06:04 tcpluess

Are there any plans for this MR?

AlexeyMatskevich avatar May 17 '24 13:05 AlexeyMatskevich

@don-brady Can you may rebase this and ask the maintainers whats left to get it integrated? Much thanks in any case!

jumbi77 avatar Jun 26 '24 22:06 jumbi77

@don-brady Can you may rebase this and ask the maintainers whats left to get it integrated? Much thanks in any case!

Hello @don-brady, can I politely ask for rebasing? Looking forward to this feature. Much thanks in advance.

jumbi77 avatar Sep 27 '24 17:09 jumbi77