
[Questions] How to replace the original file with the duplicate with the oldest modification time?

Open amalgame21 opened this issue 10 months ago • 5 comments

Thank you for creating this utility; it is great. I use it to keep all my files (mostly videos, audio, and images) centralised in one place. I usually run rmlint -pprg RANDOM_DIR_1 RANDOM_DIR_N // CENTRALISED_DIR -km so that CENTRALISED_DIR stays untouched. If any files are left over in the RANDOM_DIRs afterwards, I move them into CENTRALISED_DIR manually and keep it well organized.

However, I found that some of the files in my CENTRALISED_DIR are duplicates with a newer modification time, and I want to keep the oldest copy in this directory. I know that the option -S m can mark the oldest duplicate as the original, but I do not know how to combine it with the command above. If I untag CENTRALISED_DIR and apply -S m, some files inside it may be deleted because they have a newer modification time. Afterwards I would not know where they were originally located inside CENTRALISED_DIR, because it has many subdirectories, so it would be very hard to manually move the leftover files from RANDOM_DIR into CENTRALISED_DIR.

What should I do to solve this problem? Thanks!

amalgame21 avatar Oct 13 '23 17:10 amalgame21

Did you figure this out?

cebtenzzre avatar Oct 13 '23 22:10 cebtenzzre

Yes, I now use reflinks before deleting anything, which should be safer. First, I run rmlint -pp -r -g -T df -S ma RANDOM_DIR_1 RANDOM_DIR_N CENTRALISED_DIR to find the oldest duplicate. Then I manually edit rmlint.sh to replace every remove_cmd with cp_reflink, and I comment out the lines with the touch command in the cp_reflink function. Then I run the shell script. Lastly, I run rmlint -pp -r -g -T df -S ma RANDOM_DIR_1 RANDOM_DIR_N // CENTRALISED_DIR -km to delete all the duplicates in the untagged folders.
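The manual edit of rmlint.sh described above could probably be scripted. The following is only a sketch under assumptions: the miniature rmlint.sh below is a hypothetical stand-in for the generated script (the real one is much longer and its functions do more), and it assumes that call sites start at the beginning of a line with remove_cmd followed by a space. Commenting out the touch lines inside cp_reflink is not covered here and would still have to be done by hand.

```shell
# Hypothetical stand-in for a generated rmlint.sh, for illustration only.
workdir=$(mktemp -d)
cat > "$workdir/rmlint.sh" <<'EOF'
remove_cmd() { echo "remove $1"; }
cp_reflink() { echo "reflink $1 <- $2"; }
remove_cmd '/random/a.mp4' '/central/a.mp4' # duplicate
remove_cmd '/random/b.mp4' '/central/b.mp4' # duplicate
EOF

# Rewrite only the call sites (lines starting with "remove_cmd "),
# leaving the "remove_cmd() {" function definition untouched.
sed 's/^remove_cmd /cp_reflink /' "$workdir/rmlint.sh" > "$workdir/rmlint-reflink.sh"

# Run the rewritten stand-in script to see which handler is called.
sh "$workdir/rmlint-reflink.sh"
```

Writing the result to a second file rather than editing in place keeps the original script available for comparison before anything is run on real data.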

I expected that appending -c sh:reflink to the above command would do this without manually modifying the shell script, but it does not seem to take the modification time into account, so it may generate a shell script in which the original is the file with the newer modification time.

amalgame21 avatar Oct 24 '23 21:10 amalgame21

> I expected that appending -c sh:reflink to the above command would do this without manually modifying the shell script,

The order of arguments to reflink does not matter. Once two files are reflinked, they can only be told apart by their path. And the touch command is necessary to preserve the modification time, otherwise cp --reflink just sets the mtime to the current date.
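To illustrate the point about touch, here is a minimal sketch of the general "stash and restore" pattern. It is not the actual cp_reflink code from rmlint.sh; the helper name copy_keep_target_mtime and the file names are made up for this example, and --reflink=auto is used instead of --reflink=always so that it also runs on filesystems without reflink support. It assumes GNU cp, touch, and stat.

```shell
workdir=$(mktemp -d)
printf 'same content\n' > "$workdir/target"
printf 'same content\n' > "$workdir/source"
touch -d '2020-01-01 00:00:00 UTC' "$workdir/target"
touch -d '2001-01-01 00:00:00 UTC' "$workdir/source"

# Hypothetical helper: replace $1 with a copy of $2 but keep $1's old mtime.
copy_keep_target_mtime() {
    stamp=$(mktemp)
    touch -r "$1" "$stamp"        # stash $1's timestamps before the copy
    cp --reflink=auto "$2" "$1"   # without --archive, $1 gets the current time
    touch -r "$stamp" "$1"        # restore the stashed timestamps onto $1
    rm -f "$stamp"
}

copy_keep_target_mtime "$workdir/target" "$workdir/source"
stat -c '%y %n' "$workdir/target" "$workdir/source"
```

After the call, the target keeps its own earlier timestamp rather than taking the source's or the current time.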

cebtenzzre avatar Oct 24 '23 23:10 cebtenzzre

> The order of arguments to reflink does not matter. Once two files are reflinked, they can only be told apart by their path. And the touch command is necessary to preserve the modification time, otherwise cp --reflink just sets the mtime to the current date.

In the cp_reflink function, cp --archive --reflink=always "$2" "$1" sets the mtime of $1 to the mtime of $2, which is what I want. However, the touch commands before and after the cp command preserve the original mtime of $1. I don't want that; I want $1 to take the mtime of the earliest file.
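A small sketch of the behaviour I am after, with made-up file names and a hypothetical function name (cp_reflink_keep_source_mtime is not part of rmlint.sh). It uses --reflink=auto instead of --reflink=always only so that it also runs on filesystems without reflink support, and it assumes GNU cp, touch, and stat.

```shell
workdir=$(mktemp -d)
printf 'same content\n' > "$workdir/oldest"
printf 'same content\n' > "$workdir/newer"
touch -d '2001-01-01 00:00:00 UTC' "$workdir/oldest"
touch -d '2020-01-01 00:00:00 UTC' "$workdir/newer"

# Hypothetical variant without the surrounding touch calls:
# $1 is the file to overwrite, $2 is the file whose data and mtime should win.
cp_reflink_keep_source_mtime() {
    cp --archive --reflink=auto "$2" "$1"   # --archive carries $2's mtime over to $1
}

cp_reflink_keep_source_mtime "$workdir/newer" "$workdir/oldest"

# Both paths should now report the same (older) modification time.
stat -c '%Y %n' "$workdir/oldest" "$workdir/newer"
```

With this variant the order of the two arguments matters, because the second argument decides which modification time survives.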

amalgame21 avatar Nov 03 '23 14:11 amalgame21

Ah, I see. You care about the order because of the way the touch command is run. I'll have to look into this.

cebtenzzre avatar Nov 03 '23 15:11 cebtenzzre