rdfind
Consider the number of hard links for a and b when deciding how to create the hard link between a and b
This is partly related to #18, but goes beyond it.
Consider this example:
# echo abc>a
# cp a b
# ln b b1
# ln b b2
# ln b c
# stat --format="name=%n inode=%i nhardlinks=%h" a* b* c*
name=a inode=12857931 nhardlinks=1
name=b inode=12857932 nhardlinks=4
name=b1 inode=12857932 nhardlinks=4
name=b2 inode=12857932 nhardlinks=4
name=c inode=12857932 nhardlinks=4
We start with a file set where 4 files use the same inode (b, b1, b2, c).
Then run
# rdfind -removeidentinode false -makehardlinks true ./a* ./b*
# stat --format="name=%n inode=%i nhardlinks=%h" a* b* c*
name=a inode=12857931 nhardlinks=4
name=b inode=12857931 nhardlinks=4
name=b1 inode=12857931 nhardlinks=4
name=b2 inode=12857931 nhardlinks=4
name=c inode=12857932 nhardlinks=1
Please note that:
- c is not in the rdfind input!
- you can remove -removeidentinode false to get the known "caveat" problem, but this is not the point
The result is that we've broken the link between b* and c, and we've not gained any space: c still references the old inode (12857932), so its data cannot be freed.
This can be a problem when you have a set of "snapshots" created with rsync, linked together with --link-dest and you run rdfind on just some of these snapshots.
rdfind seems to take the first encountered file as the target for hard link creation. If it had instead taken one of the files with the highest number of hard links (b, b1 or b2), the result could have been:
name=a inode=12857932 nhardlinks=5
name=b inode=12857932 nhardlinks=5
name=b1 inode=12857932 nhardlinks=5
name=b2 inode=12857932 nhardlinks=5
name=c inode=12857932 nhardlinks=5
No links broken and space reclaimed!
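
To make the proposed rule concrete, here is a minimal Python sketch. It is not rdfind's code and does not reflect how rdfind is implemented; the helper names (pick_link_target, relink_to, demo) and the ".relink-tmp" suffix are made up for illustration. It assumes the given paths are already known to be identical in content, picks the one whose inode has the most hard links, and relinks the others to it.

```python
import os
import tempfile


def pick_link_target(paths):
    """Return the path whose inode has the most hard links.

    max() keeps the first maximal item, so ties go to the earliest path.
    """
    return max(paths, key=lambda p: os.stat(p).st_nlink)


def relink_to(target, paths):
    """Make every path a hard link to target (paths assumed identical in content)."""
    t = os.stat(target)
    for p in paths:
        s = os.stat(p)
        if (s.st_dev, s.st_ino) == (t.st_dev, t.st_ino):
            continue  # already the same inode, nothing to do
        tmp = p + ".relink-tmp"  # hypothetical temporary name
        os.link(target, tmp)     # new link to the chosen inode
        os.replace(tmp, p)       # atomically swap it in place of p


def demo():
    """Rebuild the a, b, b1, b2, c example in a temp dir and relink a, b, b1, b2."""
    with tempfile.TemporaryDirectory() as d:
        path = lambda name: os.path.join(d, name)
        for name in ("a", "b"):
            with open(path(name), "w") as f:
                f.write("abc\n")
        for name in ("b1", "b2", "c"):
            os.link(path("b"), path(name))

        # c is deliberately left out of the input, as in the example.
        dupes = [path(n) for n in ("a", "b", "b1", "b2")]
        relink_to(pick_link_target(dupes), dupes)

        names = ("a", "b", "b1", "b2", "c")
        counts = {n: os.stat(path(n)).st_nlink for n in names}
        inodes = {os.stat(path(n)).st_ino for n in names}
        return counts, len(inodes)


if __name__ == "__main__":
    print(demo())
```

With this choice of target, a joins the inode that b, b1, b2 and c already share, so c keeps its link even though it was never passed in. A real implementation would still need to handle files on different devices and ties between inodes with equal link counts.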