compsize
SEARCH_V2 buffer is too big, slows compsize down
I found that compsize was slower than btrfs fi du, which shouldn't happen because btrfs fi du uses FIEMAP while compsize uses TREE_SEARCH_V2.
# time btrfs fi du -s .
Total Exclusive Set shared Filename
6.92GiB 351.20MiB 4.67GiB .
real 0m2.414s
user 0m0.177s
sys 0m2.189s
# time compsize .
Processed 96083 files, 170199 regular extents (202723 refs), 39927 inline.
Type Perc Disk Usage Uncompressed Referenced
TOTAL 52% 3.6G 6.9G 7.0G
none 100% 2.2G 2.2G 2.0G
zlib 32% 1.3G 4.1G 4.5G
zstd 14% 75M 505M 478M
real 0m7.719s
user 0m0.105s
sys 0m7.532s
Some stracing found this:
ioctl(6, BTRFS_IOC_TREE_SEARCH_V2, {key={tree_id=0, min_objectid=17466447, max_objectid=17466447, ..., buf_size=16777216} => {key={nr_items=170},
This is zeroing out a 16 MB memory buffer to read one inode, and it slows compsize down to the point where even glacial FIEMAP calls can catch up.
Changing the buffer size to 64K makes it run much faster:
# time compsize-64k .
Processed 96083 files, 170199 regular extents (202723 refs), 39927 inline.
Type Perc Disk Usage Uncompressed Referenced
TOTAL 52% 3.6G 6.9G 7.0G
none 100% 2.2G 2.2G 2.0G
zlib 32% 1.3G 4.1G 4.5G
zstd 14% 75M 505M 478M
real 0m1.089s
user 0m0.102s
sys 0m0.950s
but the loop retry condition in do_file has to be fixed. I would suggest setting max_objectid = min_objectid + 1, then bailing out of the loop as soon as you see an objectid greater than the inode you're looking at, or as soon as the ioctl returns with nr_items set to zero. Then it runs even faster:
# time compsize-64k-plus-one .
Processed 96083 files, 170199 regular extents (202723 refs), 39927 inline.
Type Perc Disk Usage Uncompressed Referenced
TOTAL 52% 3.6G 6.9G 7.0G
none 100% 2.2G 2.2G 2.0G
zlib 32% 1.3G 4.1G 4.5G
zstd 14% 75M 505M 478M
real 0m0.933s
user 0m0.086s
sys 0m0.800s
64K is the largest possible metadata page size. TREE_SEARCH_V2 will never need space for an item that large, so you don't have to worry about an item being too large for the buffer.
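Putting the two fixes together, the search setup and retry loop in do_file might look roughly like this. This is a sketch, not compsize's actual code: `init_extent_search` and `walk_extents` are made-up names, and the per-item accounting is elided.

```c
#include <linux/btrfs.h>
#include <linux/btrfs_tree.h>
#include <string.h>
#include <sys/ioctl.h>

/* 64 KiB: the largest possible metadata node, so no item can overflow it. */
#define SEARCH_BUF_SIZE (64 * 1024)

/* Restrict the search to one inode's EXTENT_DATA items. */
void init_extent_search(struct btrfs_ioctl_search_key *key, __u64 st_ino)
{
    memset(key, 0, sizeof(*key));
    key->min_objectid = st_ino;
    key->max_objectid = st_ino;           /* this inode only */
    key->min_type = BTRFS_EXTENT_DATA_KEY;
    key->max_type = BTRFS_EXTENT_DATA_KEY;
    key->max_offset = (__u64)-1;
    key->max_transid = (__u64)-1;         /* transid is a separate filter */
}

/* Retry loop: refill the 64 KiB buffer until the kernel returns 0 items. */
int walk_extents(int fd, __u64 st_ino)
{
    /* 8-byte-aligned backing store for the args header plus the buffer */
    __u64 mem[(sizeof(struct btrfs_ioctl_search_args_v2) + SEARCH_BUF_SIZE)
              / sizeof(__u64)];
    struct btrfs_ioctl_search_args_v2 *args = (void *)mem;

    init_extent_search(&args->key, st_ino);
    for (;;) {
        args->key.nr_items = (__u32)-1;   /* bounded by buf_size, not count */
        args->buf_size = SEARCH_BUF_SIZE;
        if (ioctl(fd, BTRFS_IOC_TREE_SEARCH_V2, args))
            return -1;
        if (args->key.nr_items == 0)      /* bail: nothing left for this inode */
            break;

        unsigned long off = 0;
        for (__u32 i = 0; i < args->key.nr_items; i++) {
            struct btrfs_ioctl_search_header sh;
            memcpy(&sh, (char *)args->buf + off, sizeof(sh));
            off += sizeof(sh) + sh.len;
            /* ...account this extent ref here... */
            args->key.min_offset = sh.offset + 1;  /* resume after last item */
        }
    }
    return 0;
}
```

With min_objectid == max_objectid there is no need to check returned objectids at all; the only termination condition is nr_items == 0.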
The first part, reducing the buffer size, works even more impressively on my test filesystem. That's pure awesomesity!
On the other hand, using max_objectid beyond the st_ino makes it slower by ~11% on the same filesystem.
Branch with both changes applied: https://github.com/kilobyte/compsize/tree/loop_retry
> On the other hand, using max_objectid beyond the st_ino makes it slower by ~11% on the same filesystem.
That's interesting. Any idea why?
Hmmm... maybe the other max_ fields should be set differently. As it stands, it will be fetching two inodes' complete extent maps. Maybe set max_type to BTRFS_INODE_ITEM_KEY, so it stops when it reaches the next inode but doesn't dump out its extent list?
That would be:
min = (st_ino, BTRFS_EXTENT_DATA_KEY, 0), max = (st_ino + 1, BTRFS_INODE_ITEM_KEY, 0)
That should make the search stop as soon as it finds the next inode, without including the inode's extents (or directory entries, if the next inode is a directory). At the end of the subvol it will have to do another 0-item loop, but usually there are a lot more inodes than subvols.
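In code, that suggested key range would look something like this sketch (`init_search_range` is an illustrative name, not something in compsize). It relies on tree keys sorting lexicographically by (objectid, type, offset), with BTRFS_INODE_ITEM_KEY sorting before BTRFS_EXTENT_DATA_KEY:

```c
#include <linux/btrfs.h>
#include <linux/btrfs_tree.h>
#include <string.h>

/* Start at this inode's first EXTENT_DATA item and stop at the next inode's
 * INODE_ITEM, so the next inode's extents (or directory entries) are never
 * fetched. */
void init_search_range(struct btrfs_ioctl_search_key *key, __u64 st_ino)
{
    memset(key, 0, sizeof(*key));
    /* min = (st_ino, BTRFS_EXTENT_DATA_KEY, 0) */
    key->min_objectid = st_ino;
    key->min_type = BTRFS_EXTENT_DATA_KEY;
    key->min_offset = 0;
    /* max = (st_ino + 1, BTRFS_INODE_ITEM_KEY, 0) */
    key->max_objectid = st_ino + 1;
    key->max_type = BTRFS_INODE_ITEM_KEY;
    key->max_offset = 0;
    key->max_transid = (__u64)-1;  /* transid is filtered separately; keep it wide open */
}
```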
This helped, but it is still slower than continuing the loop only when the previous run returned more than 512 items.
Guys, this is super cool! A compsize run on one sample directory on my system takes just 0.15s with the loop_retry branch, while it takes 5.5s with the version from my package manager. The loop_retry branch let me work out which directories actually take up space!
Why hasn't this been merged already? I suspect you'd like to get it perfect before merging. Would it be possible to make a release with the main improvement and ponder the fine-tuning afterwards?
@Self-Perfection: what version are you comparing with? 1.5 should already have the buffer-size speedup.
If the alternate retry condition makes such a large difference, that's good to hear, but I suspect your distro ships a version older than 1.5. Care to check?
@kilobyte you are right. My distro has an older compsize version. Thank you for pointing this out.