compsize
Exclusive / shared usage
Would it be possible to add a bit of code to show exclusive usage for a subvolume? The only way to do that which I've found on the Internet seems to be to enable quota and then use btrfs qgroup show. However, when I enable quota, btrfs becomes unusable for a long time (at least 15 minutes) after snapshot creation, so I can't afford to do that.
Right now I have something like this:
# btrfs subvolume list --sort=rootid -t /data/pg_data
ID gen top level path
-- --- --------- ----
258 14791 5 mirrors/prod-db-c01_5432
328 14791 5 snapshots/bckash_6432
# btrfs-compsize /data/pg_data/mirrors/prod-db-c01_5432
Processed 14759 files, 18988196 regular extents (22223973 refs), 27 inline.
Type Perc Disk Usage Uncompressed Referenced
TOTAL 12% 203G 1.5T 1.5T
none 100% 11G 11G 11G
zstd 12% 192G 1.5T 1.5T
# btrfs-compsize /data/pg_data/snapshots/bckash_6432
Processed 6022 files, 10202853 regular extents (12257581 refs), 21 inline.
Type Perc Disk Usage Uncompressed Referenced
TOTAL 10% 95G 905G 879G
none 100% 55M 55M 55M
zstd 10% 95G 905G 879G
snapshots/bckash_6432 was created as a snapshot of mirrors/prod-db-c01_5432. Then a lot of files in the snapshot were deleted and a few were changed, so I'd like to know what its exclusive usage is. If I call btrfs-compsize on both directories I get this:
# btrfs-compsize /data/pg_data/mirrors/prod-db-c01_5432 /data/pg_data/snapshots/bckash_6432
Processed 20781 files, 19001055 regular extents (34492529 refs), 48 inline.
Type Perc Disk Usage Uncompressed Referenced
TOTAL 12% 203G 1.5T 2.3T
none 100% 11G 11G 11G
zstd 12% 192G 1.5T 2.3T
I don't see whether it's possible to calculate the exclusive usage for snapshots/bckash_6432 from these numbers.
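A rough estimate can be read off the figures above. This assumes compsize counts each extent only once across all the paths it is given (which the unchanged 203G disk usage in the combined run suggests), and that the mirror did not change between runs. The mirror is under constant replication, so the result is approximate at best. The variable names are only illustrative.

```python
# Figures copied from the compsize runs above.
mirror_extents = 18_988_196      # regular extents, mirror alone
combined_extents = 19_001_055    # regular extents, mirror + snapshot in one run

mirror_disk_g = 203              # "Disk Usage" TOTAL, mirror alone
combined_disk_g = 203            # "Disk Usage" TOTAL, combined run

# Assumption: the combined run is the union of both subvolumes' extents,
# so union minus mirror is what only the snapshot references.
snapshot_only_extents = combined_extents - mirror_extents
snapshot_only_disk_g = combined_disk_g - mirror_disk_g

print(snapshot_only_extents)     # extents seen in the snapshot but not the mirror
print(snapshot_only_disk_g)      # 0 here: the difference is lost in rounding
```

At the displayed precision the disk usage difference rounds to zero, so the human-readable output only shows that the snapshot's exclusive data is small relative to 203G. compsize's -b option prints raw byte counts, which should make the difference visible.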
So would it be possible to add a command line flag which would calculate usage only for files which are not reflinked? Then I could call btrfs-compsize without that flag on the mirror subvolume and after that call it with the flag on the snapshot subvolume.
Any other method would be fine as well; this just looks like the simplest. But I don't quite understand how the code works, so I could be wrong.
"Not reflinked" isn't that simple: a file reflinked within the same subvolume (e.g. cp does that by default nowadays) should be included. Also, compsize cares about extents, not files, and a file may have reflinked extents even within itself.
Thus, I wonder if an argument like -s $DIR that does set subtraction of extents would fit your use cases. That is: to get all extents that are included in the primary directory but not in the subtracted one.
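The set subtraction described here can be sketched on a toy model. This is not compsize code, and the -s flag is only a proposal in this thread. The sketch assumes each directory has already been reduced to a map from extent disk address to bytes on disk; the function name and all the data are hypothetical.

```python
def exclusive_extents(primary, subtracted):
    """Return the extents present in `primary` but absent from `subtracted`."""
    return {addr: size for addr, size in primary.items() if addr not in subtracted}


# Hypothetical extent maps: disk address -> bytes on disk.
mirror = {4096: 8192, 16384: 4096, 32768: 12288}
snapshot = {4096: 8192, 32768: 12288, 65536: 4096, 81920: 16384}

# Extents the snapshot references that the mirror does not.
only_in_snapshot = exclusive_extents(snapshot, mirror)
exclusive_bytes = sum(only_in_snapshot.values())

print(sorted(only_in_snapshot), exclusive_bytes)
```

Because the keys are extents rather than files, a file reflinked inside the primary directory, or an extent referenced twice by the same file, still appears once in the map and is not excluded.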
Well, I currently have a really simple use case. All of my subvolumes are Postgres data directories. The one called mirror is a Postgres slave which is in constant replication from the master. From time to time I create a snapshot (in the snapshot directory) from the mirror subvolume and start a master Postgres database on the snapshot subvolume (on another port). So I don't have reflinks from cp. Data in the mirror subvolume is originally filled by rsync from the backup server, and I don't know if rsync does something with reflinks.
The more complex case would be creating a snapshot2 subvolume from snapshot1 (which has been created from mirror). If compsize -s snapshot1 snapshot2 would show me only extents for which snapshot2 is an exclusive owner, that would be great.
@dkacar-oradian I did something along these lines using btrfs-python: https://github.com/daviessm/btrfs-snapshots-diff/blob/master/btrfs-subvol-size.py
If you cut out most of the printing lines, the end result should be the answer to the question "how much space would I gain if I deleted these files in this subvolume?" - where "these files" could be the entire subvolume.
PS it's a bit slow for large subvolumes.
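To illustrate the question being answered (this is a toy model, not how the linked script is implemented): an extent is freed only when every reference to it comes from the files being deleted. The sketch assumes the (file, extent) references have already been collected; the function name and all the data are hypothetical.

```python
from collections import Counter


def reclaimable_bytes(all_refs, doomed_files, extent_sizes):
    """Sum the sizes of extents referenced only by files in `doomed_files`."""
    total = Counter(ext for _, ext in all_refs)
    doomed = Counter(ext for f, ext in all_refs if f in doomed_files)
    # An extent is released only if no reference outside the doomed set remains.
    return sum(extent_sizes[e] for e, n in doomed.items() if n == total[e])


# Hypothetical (file, extent) references across the whole filesystem.
all_refs = [
    ("mirror/a", "e1"), ("snap/a", "e1"),   # shared with the mirror
    ("snap/b", "e2"), ("snap/b", "e2"),     # referenced twice within one file
    ("snap/c", "e3"),
    ("mirror/c", "e4"),
]
extent_sizes = {"e1": 100, "e2": 200, "e3": 50, "e4": 70}

freed = reclaimable_bytes(all_refs, {"snap/a", "snap/b", "snap/c"}, extent_sizes)
print(freed)
```

Here e1 is kept alive by the mirror, so only e2 and e3 count towards the space gained.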