zfs icon indicating copy to clipboard operation
zfs copied to clipboard

zdb: cleanup and improve cachefile handling

Open ixhamza opened this issue 2 years ago • 4 comments

Motivation and Context

  • Add missing cleanup for early return.
  • Automatically detect cache file; otherwise, force import.

Description

  • When encountering an error or early return, kernel_fini is not called.
  • If a pool is created with the cache file located in a non-default path /etc/default/zpool.cache, removed, or the cachefile property is set to none, zdb fails to show the pool unless we specify the cache file or use the -e option. This PR automates this process.

How Has This Been Tested?

CI Tests and tested with the following five scenarios. Without this patch, only scenario 1 works without specifying additional arguments:

  1. Default cache file path /etc/zfs/zpool.cache, and the file exists.
  2. Default cache file path /etc/zfs/zpool.cache, and the file does not exist.
  3. Custom cache file path /data/zfs/zpool.cache, and the file exists.
  4. Custom cache file path /data/zfs/zpool.cache, and the file does not exist.
  5. Cachefile property set to none.

Types of changes

  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Performance enhancement (non-breaking change which improves efficiency)
  • [x] Code cleanup (non-breaking change which makes code smaller or more readable)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • [ ] Documentation (a change to man pages or other documentation)

Checklist:

  • [x] My code follows the OpenZFS code style requirements.
  • [ ] I have updated the documentation accordingly.
  • [x] I have read the contributing document.
  • [ ] I have added tests to cover my changes.
  • [ ] I have run the ZFS Test Suite with this change applied.
  • [x] All commit messages are properly formatted and contain Signed-off-by.

ixhamza avatar Apr 08 '24 22:04 ixhamza

When cachefile property is set to none, do [-d] or [-O] options require the -e option? or is that automated as well?

akashb-22 avatar Apr 16 '24 16:04 akashb-22

When cachefile property is set to none, do [-d] or [-O] options require the -e option? or is that automated as well?

@akashb-22 - Considering libzfs is able to read the cachefile property correctly, it should be automated.

ixhamza avatar Apr 16 '24 17:04 ixhamza

I see some disparity when I use the pool-name and dataset name as target.

# zpool get cachefile
NAME       PROPERTY   VALUE      SOURCE
pool-oss5  cachefile  none       local

# zdb -d pool-oss5
Dataset mos [META], ID 0, cr_txg 4, 5.62M, 171 objects
Dataset pool-oss5/ost5 [ZPL], ID 387, cr_txg 67, 1.20G, 7 objects
Dataset pool-oss5 [ZPL], ID 54, cr_txg 1, 1024K, 6 objects
Verified large_blocks feature refcount of 1 is correct
Verified large_dnode feature refcount of 1 is correct
Verified sha512 feature refcount of 0 is correct
Verified skein feature refcount of 0 is correct
Verified edonr feature refcount of 0 is correct
Verified userobj_accounting feature refcount of 1 is correct
Verified encryption feature refcount of 0 is correct
Verified project_quota feature refcount of 1 is correct
Verified redaction_bookmarks feature refcount of 0 is correct
Verified redacted_datasets feature refcount of 0 is correct
Verified bookmark_written feature refcount of 0 is correct
Verified livelist feature refcount of 0 is correct
Verified zstd_compress feature refcount of 0 is correct
Verified zilsaxattr feature refcount of 1 is correct
Verified blake3 feature refcount of 0 is correct
Verified redaction_list_spill feature refcount of 0 is correct
Verified device_removal feature refcount of 0 is correct
Verified indirect_refcount feature refcount of 0 is correct

# zdb -d pool-oss5/ost5
failed to hold dataset 'pool-oss5/ost5': No such file or directory
zdb: can't open 'pool-oss5/ost5': No such file or directory

# zdb -d pool-oss5/ost5 -e
Dataset pool-oss5/ost5 [ZPL], ID 387, cr_txg 67, 1.20G, 7 objects

I think the dataset as the target doesn't go well here maybe, I'm not sure, the same with -e seems to be working fine. @ixhamza is this expected?

akashb-22 avatar Apr 16 '24 19:04 akashb-22

@akashb-22 - I have made some adjustments to automate the scenario you mentioned. Can you give it a try now?

ixhamza avatar Apr 16 '24 21:04 ixhamza

@behlendorf - I have updated the patch according to your suggestions and rebased it with master branch. The alloc_class tests, which were previously failing, are now passing. However, the remaining failing test cases seem to be unrelated.

ixhamza avatar May 29 '24 15:05 ixhamza