
Detect moved or renamed files

Open braian87b opened this issue 6 years ago • 7 comments

While searching for this script, which I currently use and think is great, I found another one:

https://github.com/dparoli/hrsync

It first builds a .shadow directory (which I'd recommend making placeable in another, configurable location), and uses it to detect moved or renamed files...

That would be an excellent feature to add. It is common for end users to move around or rename files, or even entire directory trees, and that easily inflates the space used by the backup, the transfer amount, and the time it takes; yet none of it is actually necessary if the files are tracked and the hard links pointed at them properly.
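For illustration, the content-hash idea behind this kind of move detection can be sketched with plain coreutils. This is a hedged sketch, not hrsync's actual implementation; all paths are made up for the demo, and it assumes file names contain no spaces or newlines:

```shell
#!/bin/sh
set -e

# Demo layout: prev/ is yesterday's backup; src/ contains the same file
# renamed, plus one genuinely new file.
base=$(mktemp -d)
SRC="$base/src"; PREV="$base/prev"; NEW="$base/new"
mkdir -p "$SRC" "$PREV"
printf 'unchanged data\n' > "$PREV/report-old.txt"
printf 'unchanged data\n' > "$SRC/report-new.txt"   # renamed, same content
printf 'fresh data\n' > "$SRC/notes.txt"            # genuinely new

# Index the previous backup by content hash, then hardlink matches into
# the new backup instead of copying them again.
index=$(mktemp)
( cd "$PREV" && find . -type f -exec md5sum {} + ) > "$index"

( cd "$SRC" && find . -type f ) | while IFS= read -r f; do
    hash=$(md5sum "$SRC/$f" | cut -d' ' -f1)
    old=$(grep -m1 "^$hash " "$index" | sed 's/^[^ ]* *//')
    mkdir -p "$NEW/$(dirname "$f")"
    if [ -n "$old" ]; then
        ln "$PREV/$old" "$NEW/$f"   # moved/renamed: reuse existing data
    else
        cp -p "$SRC/$f" "$NEW/$f"
    fi
done
```

The renamed file ends up as a hardlink to the old backup's copy, so no new data is stored or transferred for it; only the genuinely new file is copied.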

What do you think?

braian87b avatar Nov 16 '17 18:11 braian87b

I like the idea. I was actually thinking about using fdupes to detect duplicate files and replace them with hard-links on the destination side, but the fdupes -L option has been removed.
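Since fdupes -L is gone, a similar effect can be approximated with coreutils alone: group files by content hash and relink duplicates. A rough sketch only (not fdupes, demo paths are made up, file names with spaces/newlines are not handled, and a real tool would byte-compare files before linking rather than trust the hash):

```shell
#!/bin/sh
set -e

# Demo tree containing one duplicated file.
dir=$(mktemp -d)
printf 'shared bytes\n' > "$dir/one.txt"
printf 'shared bytes\n' > "$dir/copy.txt"
printf 'unique bytes\n' > "$dir/other.txt"

# Sort files by content hash; keep the first file of each hash group
# and replace the rest with hardlinks to it.
( cd "$dir" && find . -type f -exec md5sum {} + ) | sort | {
    prev_hash=''; prev_file=''
    while IFS= read -r line; do
        hash=${line%% *}
        file=$(printf '%s\n' "$line" | sed 's/^[^ ]* *//')
        if [ "$hash" = "$prev_hash" ]; then
            ln -f "$dir/$prev_file" "$dir/$file"   # duplicate: relink
        else
            prev_hash=$hash
            prev_file=$file
        fi
    done
}
```

After the run, the two identical files share one inode and the distinct file keeps its own.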

joekerna avatar Nov 17 '17 07:11 joekerna

@joekerna @laurent22 perhaps this functionality is available in the findup utility of the fslint program, documented at http://write.flossmanuals.net/fslint/duplicates/:

Command Line Interface

The command line interface to this utility is 'findup'. This utility will be found in the installation directory of FSlint.

 $/usr/share/fslint/fslint/findup --help
 find dUPlicate files.
 Usage: findup [[[-t [-m|-d]] | [--summary]] [-r] [-f] paths(s) ...]

 If no path(s) specified then the current directory is assumed.
  
 When -m is specified any found duplicates will be merged (using hardlinks).
 When -d is specified any found duplicates will be deleted (leaving just 1).
 When -t is specfied, only report what -m or -d would do.
 When --summary is specified change output format to include file sizes.
 You can also pipe this summary format to /usr/share/fslint/fslint/fstool/dupwaste
 to get a total of the wastage due to duplicates.

 Examples:
 search for duplicates in current directory and below
     findup or findup .
 search for duplicates in current directory and below listing the files full path
     findup -f
 search for duplicates in all linux source directories and merge using hardlinks
     findup -m /usr/src/linux*

Wikinaut avatar Mar 08 '18 20:03 Wikinaut

I'm currently experimenting with a tool named rmlint.

It detects empty directories and offers to delete them, and it detects duplicates and offers to replace them with a hardlink. I'm thinking about running this tool at the end of the backup to compare latest to the $DEST_FOLDER. My hope is that it will find those moved/renamed files and replace them with hardlinks.

joekerna avatar Mar 21 '18 15:03 joekerna

rmlint is doing a great job. I've tried it out manually. With the correct options it detects duplicate files (renamed/moved/...) extremely fast and replaces them with hard-links.

joekerna avatar Mar 22 '18 12:03 joekerna

@joekerna Would you mind sharing what options you're using? I'd like to give it a shot.

bombsandbottles avatar Mar 30 '18 18:03 bombsandbottles

 rmlint -L -c sh:hardlink

-L to ignore already hardlinked files
-c sh:hardlink to replace duplicates with hardlinks instead of deleting them
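To check whether such a run actually merged files, you can compare inodes: hardlinked duplicates share one. A toy demonstration of that end state using plain coreutils (no rmlint involved; the paths are made up):

```shell
#!/bin/sh
set -e

# Two separate files with identical content:
dir=$(mktemp -d)
printf 'same content\n' > "$dir/a"
printf 'same content\n' > "$dir/b"

# After verifying the bytes match, replace b with a hardlink to a;
# this is the end state that hardlink-based deduplication produces.
cmp -s "$dir/a" "$dir/b" && ln -f "$dir/a" "$dir/b"

# Both names now resolve to the same inode, so the data is stored once:
stat -c '%i' "$dir/a" "$dir/b"
```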

joekerna avatar Mar 30 '18 19:03 joekerna

@bombsandbottles Have you tried this yet? Anyone else? I'm interested in feedback.

Thanks

joekerna avatar Nov 19 '18 11:11 joekerna