rsync-time-backup
Detect moved or renamed files
While searching for this script, which I currently use and is great, I found another one:
https://github.com/dparoli/hrsync
It first creates a .shadow directory (which I recommend placing in another, configurable location) and uses it to detect moved or renamed files.
That would be an excellent feature to add, since end users commonly move around or rename files, or even entire directory trees. Doing so needlessly inflates the space used on the backup, as well as the transfer amount and time, even though none of that is necessary if the links are tracked and pointed properly.
What do you think?
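The shadow-directory idea can be sketched roughly like this: keep an index of checksums from the previous run, and when a "new" file's checksum is already in the index, treat it as a move/rename rather than new data. This is only a minimal illustration, not hrsync's actual implementation; the index layout and file names here are assumptions:

```shell
#!/bin/sh
# Minimal sketch of rename detection via a checksum index.
# Not hrsync's real code; the .shadow layout is an assumption.
set -eu

src=$(mktemp -d)
shadow="$src/.shadow"
mkdir -p "$shadow"

echo "hello backup" > "$src/report.txt"

# Backup run 1: record checksum and path for every file.
md5sum "$src/report.txt" > "$shadow/index"

# The user renames the file between runs.
mv "$src/report.txt" "$src/report-2024.txt"

# Backup run 2: a "new" file whose checksum already appears in the
# index is a moved/renamed file, so its data need not be transferred
# again; the backup could hard-link to the previously stored copy.
sum=$(md5sum "$src/report-2024.txt" | awk '{print $1}')
if grep -q "$sum" "$shadow/index"; then
  echo "rename detected"
fi
```

A real implementation would also have to handle checksum collisions and files whose content changed along with the rename.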
I like the idea. I was actually thinking about using fdupes to detect duplicate files and replace them with hard-links on the destination side, but the fdupes -L option has been removed.
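What fdupes -L used to do can still be reproduced by hand: confirm two files are byte-identical, then replace one with a hard link to the other. A minimal sketch (the paths are illustrative):

```shell
#!/bin/sh
# Replace a duplicate file with a hard link, roughly what
# `fdupes -L` did before the option was removed.
set -eu

dir=$(mktemp -d)
echo "same content" > "$dir/a"
cp "$dir/a" "$dir/b"           # b is a byte-identical duplicate

# Only link files that really are identical.
if cmp -s "$dir/a" "$dir/b"; then
  ln -f "$dir/a" "$dir/b"      # b now shares a's inode; no extra space used
fi

# Both names now point at the same inode, with a link count of 2.
stat -c '%i %h' "$dir/a"
stat -c '%i %h' "$dir/b"
```

The cmp check matters: linking files that merely have the same size or name would silently lose data.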
@joekerna @laurent22 perhaps you can find this functionality in the findup utility of the fslint program: http://write.flossmanuals.net/fslint/duplicates/
Command Line Interface
The command line interface to this utility is 'findup'. This utility will be found in the installation directory of FSlint.
$/usr/share/fslint/fslint/findup --help
find dUPlicate files.
Usage: findup [[[-t [-m|-d]] | [--summary]] [-r] [-f] path(s) ...]
If no path(s) specified then the current directory is assumed.
When -m is specified any found duplicates will be merged (using hardlinks).
When -d is specified any found duplicates will be deleted (leaving just 1).
When -t is specified, only report what -m or -d would do.
When --summary is specified change output format to include file sizes.
You can also pipe this summary format to /usr/share/fslint/fslint/fstool/dupwaste
to get a total of the wastage due to duplicates.
Examples:
search for duplicates in current directory and below
findup or findup .
search for duplicates in current directory and below listing the files full path
findup -f
search for duplicates in all linux source directories and merge using hardlinks
findup -m /usr/src/linux*
I'm currently experimenting with a tool named rmlint.
It detects empty directories and offers to delete them, and it detects duplicates and offers to replace them with hardlinks. I'm thinking about running this tool at the end of the backup to compare latest to the $DEST_FOLDER.
My hope is that it will find those moved/renamed files and replace them with hardlinks.
rmlint is doing a great job. I've tried it out manually. With the correct options it detects duplicate (renamed/moved/...) files extremely fast and replaces them with hard-links.
@joekerna Would you mind sharing what options you're using? I'd like to give it a shot.
rmlint -L -c sh:hardlink

-L: ignore already hardlinked files
-c sh:hardlink: replace duplicates with hardlinks instead of deleting them
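Putting the pieces together, a dedupe step at the end of a backup run might look like the sketch below. Note that rmlint does not modify anything by itself: it writes an rmlint.sh script into the current directory, which you review and then execute. The $DEST_FOLDER path and the latest snapshot name are stand-ins for your actual backup layout:

```shell
#!/bin/sh
# Sketch: dedupe the newest snapshot against the backup tree with rmlint.
# DEST_FOLDER is a stand-in for the real backup destination (assumption);
# here it is a scratch directory so the sketch runs anywhere.
DEST_FOLDER=$(mktemp -d)
mkdir -p "$DEST_FOLDER/latest"
echo "snapshot data" > "$DEST_FOLDER/latest/file.txt"

if command -v rmlint >/dev/null 2>&1; then
  cd "$DEST_FOLDER"
  # -L             skip files that are already hardlinked to each other
  # -c sh:hardlink make the generated rmlint.sh hardlink duplicates
  #                instead of deleting them
  rmlint -L -c sh:hardlink latest .
  # rmlint writes ./rmlint.sh; review it, then apply it with: sh ./rmlint.sh
  echo "rmlint finished; review rmlint.sh before running it"
else
  echo "rmlint not installed; skipping dedupe step"
fi
```

Reviewing rmlint.sh before executing it is the safety net: nothing is linked or deleted until you run that script yourself.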
@bombsandbottles Have you tried this yet? Anyone else? I'm interested in feedback
Thanks