
(Btrfs clone) No space saved, even though advertised?

Open rrueger opened this issue 2 years ago • 2 comments

I ran a rmlint -g -c sh:clone -o sh:rmlint.sh command, and was told there were 500GB of duplicated data.

When running sudo rmlint.sh -xr, it became clear that some (~20%) of the data was already reflinked. (I presume that rmlint counts this as duplicate data, but cannot free any data, because the files already share the same extents).

There were many rmlint --dedupe --dedupe-readonly calls that appeared to be successful (along with some failures).

Notably, there were at least 20GB of files that were successfully rmlint --dedupe'd.

However btrfs filesystem usage still reported the exact same amount of used/free space. Even after a reboot.

I ran rmlint against an entire subvolume, whose data is exclusive to that subvolume.

How do I understand this? I understand that it is highly likely that I am not understanding some core behaviour of btrfs.

Thank you!


Version info

$ rmlint --version
version 2.10.1 compiled: Dec  3 2021 at [01:09:27] "Ludicrous Lemur" (rev unknown)
compiled with: +mounts +nonstripped +fiemap +sha512 +bigfiles +intl +replay +xattr +btrfs-support
$ uname -r
5.15.11-arch2-1
$ btrfs --version
btrfs-progs v5.15.1

rrueger avatar Dec 27 '21 17:12 rrueger

Does this subvolume have any snapshots (btrfs subvolume list -s <fs root>)? Extents that are still referenced by snapshots will stay on disk. rmlint can deduplicate files within snapshots with -r, but in order to know about them it needs to be given the path to the snapshot like any other directory.
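
A minimal sketch of that suggestion, in case it helps. The helper names `list_snapshots` and `build_rmlint_cmd` are hypothetical, and the rmlint options are the ones from the original post. The second helper only prints the command line so it can be reviewed before running. Note that the paths printed by `btrfs subvolume list` may be relative to the top-level subvolume rather than to the mount point, so they may need adjusting by hand.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of rmlint or btrfs-progs.

# List the snapshots on the filesystem mounted at $1 (usually needs root).
# Guarded so the function is a no-op where btrfs-progs is not installed.
list_snapshots() {
    fs_root=$1
    if command -v btrfs >/dev/null 2>&1; then
        btrfs subvolume list -s "$fs_root"
    fi
}

# Print (do not run) an rmlint command that scans the subvolume $1
# together with every snapshot path given as a further argument.
# Paths containing spaces would need extra quoting before use.
build_rmlint_cmd() {
    sub=$1
    shift
    printf 'rmlint -g -c sh:clone -o sh:rmlint.sh %s' "$sub"
    for snap in "$@"; do
        printf ' %s' "$snap"
    done
    printf '\n'
}
```

For example, `build_rmlint_cmd "$SUB" "$SNAP"` prints a single command line covering both paths; the generated rmlint.sh would then be run with -r as described above.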

cebtenzzre avatar Dec 27 '21 17:12 cebtenzzre

Thank you for your quick response.

Good point, rookie error on my behalf. There was another read-only snapshot $SNAP of the subvolume $SUB.

I reran rmlint -g -c sh:clone -o sh:rmlint.sh $SUB $SNAP and was told there were 1.3TB of duplicated data.

I then executed the rmlint.sh script with -r as root and observed (for me) unexpected behaviour

  1. Similarly to the first run against only $SUB, there were many successful rmlint --dedupe --dedupe-readonly calls and a handful of failures. However, only ~1GB of data was freed.
  2. rmlint tried to clone files within $SUB. I would have expected that my first rmlint ... $SUB run would have cloned these files to each other. My understanding here is that once two files have been rmlint --dedupe'd, rmlint --is-reflink returns true?* In this case, rmlint ... $SUB $SNAP should only be cloning files within $SNAP or between $SUB and $SNAP.
  3. rmlint --dedupe --dedupe-readonly is very slow. According to glances it only reads from disk at about 50MB/s (on an SSD from which I regularly read at 500MB/s+ sustained, from which rmlint reads at 1.2GB/s during other stages of execution). I suspect this is entirely unrelated, but am mentioning anyway in case it tells you something about my disk failing or having other issues. Sorry if this turns out to be a complete red herring.

Could it be that rmlint --dedupe --dedupe-readonly can only dedupe between two read-only subvolumes? (And not between a read-only, and a writeable subvolume)

*I tried to test this hypothesis, with

echo 123 > file
cp file gile
rmlint --dedupe file gile
rmlint --is-reflink file gile
 

but was returned an exit code 5, i.e. fiemaps can't be read.
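
One possible explanation, which is an assumption and not something confirmed in this thread: btrfs can store very small files inline in its metadata, so a 4-byte file may have no regular data extents to compare. A variant of the test with larger files might look like the sketch below. The function names `make_test_pair` and `check_pair` are hypothetical; the copy is made with dd so that the data is actually written out, since a plain cp may already create a reflink on some systems.

```shell
#!/bin/sh
# Sketch only: repeat the reflink test with files large enough to get
# regular extents. Function names are hypothetical.

# Create two identical 1 MiB files, "file" and "gile", in directory $1.
make_test_pair() {
    dir=$1
    # 1 MiB of random data, well above the size that may be stored inline.
    dd if=/dev/urandom of="$dir/file" bs=1M count=1 2>/dev/null
    # dd reads and rewrites the data, so the copy starts with its own extents.
    dd if="$dir/file" of="$dir/gile" bs=1M 2>/dev/null
    # Flush to disk so the extents are allocated before they are inspected.
    sync
}

# Dedupe the pair in directory $1, then report the --is-reflink exit code.
check_pair() {
    dir=$1
    rmlint --dedupe "$dir/file" "$dir/gile"
    rmlint --is-reflink "$dir/file" "$dir/gile"
    echo "is-reflink exit code: $?"
}
```

Running `make_test_pair .` followed by `check_pair .` inside a directory on the btrfs filesystem would show whether the exit code differs from the small-file case.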


Here is my filesystem usage, perhaps something sticks out. I have rebalanced and rebooted since the rmlint runs.

# btrfs filesystem usage /btrfs 
Overall:
    Device size:		   1.78TiB
    Device allocated:		   1.49TiB
    Device unallocated:		 292.97GiB
    Device missing:		     0.00B
    Used:			   1.47TiB
    Free (estimated):		 315.15GiB	(min: 315.15GiB)
    Free (statfs, df):		 315.15GiB
    Data ratio:			      1.00
    Metadata ratio:		      1.00
    Global reserve:		 512.00MiB	(used: 0.00B)
    Multiple profiles:		        no

Data,single: Size:1.48TiB, Used:1.46TiB (98.54%)
   /dev/mapper/computer-root	   1.48TiB

Metadata,single: Size:11.00GiB, Used:7.69GiB (69.88%)
   /dev/mapper/computer-root	  11.00GiB

System,single: Size:32.00MiB, Used:224.00KiB (0.68%)
   /dev/mapper/computer-root	  32.00MiB

Unallocated:
   /dev/mapper/computer-root	 292.97GiB
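
To compare usage before and after a dedupe run, the overall Used figure can be pulled out of this output and saved. A minimal sketch, assuming the layout shown above; `used_from_usage` is a hypothetical helper name, not a btrfs-progs command.

```shell
#!/bin/sh
# Sketch only: extract the overall "Used:" value from the output of
# `btrfs filesystem usage`, read from standard input.
used_from_usage() {
    # Match only lines whose first field is exactly "Used:", which skips
    # the per-profile lines such as "Data,single: Size:..., Used:...".
    awk '$1 == "Used:" { print $2; exit }'
}

# Intended use (as root), before and after running rmlint.sh:
#   btrfs filesystem usage /btrfs | used_from_usage
```

Passing -b to btrfs filesystem usage prints raw byte counts, which makes small differences visible. For a per-path breakdown of shared versus exclusive data, `btrfs filesystem du -s $SUB $SNAP` may also be worth a look.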

rrueger avatar Dec 28 '21 09:12 rrueger