RFE: tell fdupes to always prefer a certain directory

Open macau23 opened this issue 4 years ago • 19 comments

I'd like to be able to tell fdupes to always prefer preserving files within a given directory when duplicates are found. At the moment the order in which duplicates are presented is not deterministic, which means that deleting duplicate files takes a lot longer:

$ fdupes -r --delete .
[1] ./aaa/1.txt
[2] ./bbb/1.txt
Set 1 of 2, preserve files [1 - 2, all]: x

[1] ./bbb/2.txt
[2] ./aaa/2.txt
Set 1 of 2, preserve files [1 - 2, all]: x

After the suggestion:

$ fdupes -r --prefer ./aaa/ --delete .
[1] ./aaa/1.txt
[2] ./bbb/1.txt
Set 1 of 2, preserve files [1 - 2, all]: x

[1] ./aaa/2.txt
[2] ./bbb/2.txt
Set 1 of 2, preserve files [1 - 2, all]: x

This enables me to use --noprompt without losing files from the wrong directory.

macau23 avatar Mar 03 '20 08:03 macau23

I'd like to see this feature too!

sandrotosi avatar Mar 03 '20 14:03 sandrotosi

A much-needed feature for automatically deleting all the duplicate pictures that were imported, and imported again, from and to different devices.

da-sti avatar Jan 06 '21 05:01 da-sti

Can PR #144 be reviewed and merged?

This is a very needed feature for situations where directory A has some of the same files as B, but not in the same structure.

Happened to me when trying to merge a relative's photo library with mine. They had a lot of my pics but had rearranged them.

I want to run fdupes -dNr -keep=A A B

bjhartin avatar Feb 18 '22 15:02 bjhartin

jdupes, an fdupes fork, does that with "-O". As this feature has been requested for years, I assume it will never come to fdupes.

Friday13th87 avatar Nov 15 '22 12:11 Friday13th87

jdupes, a fdupes fork, is doing that with "-O"

I'm the author of jdupes. No, it does not. That's the parameter order priority flag and only controls sorting.

jbruchon avatar Nov 15 '22 14:11 jbruchon

jdupes, a fdupes fork, is doing that with "-O"

I'm the author of jdupes. No, it does not. That's the parameter order priority flag and only controls sorting.

The order priority flag controls the sorting, meaning you can say that duplicates should be deleted in dir2 rather than in dir1, for example. OK, so far so good.

the question of the topic was:

"tell fdupes to always prefer a certain directory"

The ability to set priorities does exactly that: "prefer a directory". Meaning: jdupes -rNdO dir1/ dir2/ sets the preserve priority to dir1, so dir1 comes first and duplicates will be deleted in dir2 rather than in dir1, and that was the question.

My answer was correct in every way.

Friday13th87 avatar Nov 16 '22 07:11 Friday13th87

No, it's not. Your ego is not in question here; your correctness is. The parameter order controls the sorting, not the "preserve priority." Deletions will gladly nuke items in the first directory specified. It'll delete files in dir1 all day long. The request was to "always prefer to preserve files within a given directory." -O will (probably) preserve the first file in dir1 but all the rest of the files in dir1 in the set will be deleted. Your answer is only correct for the simplistic example in the original post. Most data sets are not nearly so simple. "The parameter order flag will 'always prefer to preserve files within a given directory'" is a false statement.

I will not entertain further discussion on this. You can't tell me I don't know how the program I wrote works.

jbruchon avatar Nov 16 '22 15:11 jbruchon
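
To make the point above concrete, here is a minimal sketch that only simulates the "keep the first file of each set, delete the rest" behaviour described in the comment. It does not run jdupes; the function name would_delete_keep_first and the sample paths are made up for illustration, and the input is assumed to be fdupes/jdupes-style output (one path per line, blank line between sets).

```shell
#!/bin/sh
# Simulation only (hypothetical helper, not part of jdupes or fdupes):
# given fdupes/jdupes-style output on stdin, print every path except the
# first one of each duplicate set, i.e. what "keep the first, delete the
# rest" would remove.
would_delete_keep_first() {
    awk '
        /^$/  { n = 0; next }   # a blank line ends the current set
              { n++ }           # count entries within the set
        n > 1 { print }         # everything after the first entry is removed
    '
}

# One set with two copies under dir1 and one under dir2, dir1 sorted first.
printf '%s\n' dir1/a.jpg dir1/copy/a.jpg dir2/a.jpg '' | would_delete_keep_first
# prints:
#   dir1/copy/a.jpg
#   dir2/a.jpg
```

Even with dir1 sorted first, dir1/copy/a.jpg ends up on the removal list, which is why sorting alone is not the same as always preserving files within a given directory.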

No, it's not. Your ego is not in question here; your correctness is. The parameter order controls the sorting, not the "preserve priority." Deletions will gladly nuke items in the first directory specified. It'll delete files in dir1 all day long. The request was to "always prefer to preserve files within a given directory." -O will (probably) preserve the first file in dir1 but all the rest of the files in dir1 in the set will be deleted. Your answer is only correct for the simplistic example in the original post. Most data sets are not nearly so simple. "The parameter order flag will 'always prefer to preserve files within a given directory'" is a false statement.

I will not entertain further discussion on this. You can't tell me I don't know how the program I wrote works.

No, sorry, you are not right, and this is not about my ego; sadly, it's about yours. I just wanted to help, and you are doing exactly the opposite to prove I don't know what.

The initial poster was searching for a way to prefer one directory over another, which means "if possible, delete from directory x and not from y; if that's not possible, do what you have to do". Preferring one directory over another doesn't mean the initial poster wanted to prohibit deletions from one directory, just to prefer deleting from the other, so that if there is a duplicate in both dirs it is left in one specific dir. jdupes dir1/ dir2/ does this.

and for @bjhartin with jdupes you can do as you wish easily with:

chmod -R 555 dir1/    # jdupes can't delete files here, but can still read them to calculate hashes etc.
jdupes -rNdO dir1/ dir2/
chmod -R 755 dir1/    # or whatever privileges you like to give the folder

i hope that helped.

Friday13th87 avatar Nov 16 '22 16:11 Friday13th87

No, it's not. Your ego is not in question here; your correctness is.

@Friday13th87 You're being really unhelpful. The fellow said he's the author of jdupes; shut it down. Whether you think you're right no longer matters.

You're giving advice you claim as authoritative when the author of the program refuted you.

To anyone visiting this thread (and likely any others with this Friday person): caveat emptor.

I came looking for a way to do this thing, too, incidentally. I'd love a way to --prefer /some/arbitrary/master/path in one of these tools.

I suppose it's back to setting the "master" as read-only and running fdupes to see if it blows up.

JohnCrafton avatar Nov 17 '22 20:11 JohnCrafton

The fellow said he's the author of jdupes; shut it down. Whether you think you're right no longer matters.

To be fair: I didn't write every piece of code in jdupes and it's entirely possible to trip over my own human errors. The code behind -O, however, I personally wrote and tested. I know exactly what it does and there's a good chance I don't have dementia (yet). Fortunately, I can be completely mentally broken and anyone can still see exactly how it works.

jbruchon avatar Nov 17 '22 20:11 jbruchon

@JohnCrafton you might find the example scripts in the jdupes code base to be useful. I recognized that many people want to perform custom actions that the core program doesn't handle, so I wrote some template/example shell scripts that can be modified to suit your needs. They should also be able to use fdupes instead of jdupes as long as you check the options passed to the program. The output format is the same (duplicate items one per line with an empty line between duplicate sets). You can use grep to match a substring and decide to not act upon a specific directory or file, for example.

jbruchon avatar Nov 17 '22 20:11 jbruchon
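
As a rough illustration of the approach described above (this is not one of the jdupes example scripts), the sketch below reads the shared output format, one path per line with an empty line between sets, and prints only the paths that could be removed while keeping every copy under a preferred prefix. The function name prefer_dir and the variable KEEP_PREFIX are hypothetical, and it assumes file names contain no newlines and that the prefix is written the same way fdupes prints it (for example dir1/ when running fdupes -r dir1 dir2).

```shell
#!/bin/sh
# prefer_dir PREFIX (hypothetical helper): read fdupes-style output on stdin
# and print the duplicates that live outside PREFIX. Sets with no copy under
# PREFIX are left alone, so nothing is listed for them.
prefer_dir() {
    # Pass the prefix through the environment so awk does not interpret
    # backslashes in it, as it would with -v.
    KEEP_PREFIX=$1 awk '
        BEGIN { keep = ENVIRON["KEEP_PREFIX"] }
        function emit_set(   i) {
            # Only act when at least one copy is preserved under the prefix.
            if (kept > 0)
                for (i = 1; i <= n; i++) print other[i]
            n = 0; kept = 0
        }
        /^$/                 { emit_set(); next }  # blank line ends a set
        index($0, keep) == 1 { kept++; next }      # path starts with prefix: keep
                             { other[++n] = $0 }   # candidate for removal
        END                  { emit_set() }        # handle a missing final blank line
    '
}
```

A cautious way to use it is to review the list first, for example fdupes -r dir1 dir2 | prefer_dir dir1/ > to-remove.txt, and only then feed that file to a deletion step that handles spaces and quotes in file names.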

I came looking for a way to do this thing, too, incidentally. I'd love a way to --prefer /some/arbitrary/master/path in one of these tools.

I suppose it's back to setting the "master" as read-only and running fdupes to see if it blows up.

If you don't need it to run unsupervised (via -N) then you can use the new fdupes interactive mode to do this:

selb /some/arbitrary/master/path
isel
ds
prune

The first command will select every file in your "master" path, the second will deselect those and instead select their duplicates, the third will mark the now selected ones for deletion, and the last one will delete them.

adrianlopezroche avatar Nov 17 '22 20:11 adrianlopezroche

Been awhile. Somewhere I saw this recommended for choosing which to delete:

fdupes -r dir1 dir2|grep dir1/|xargs rm

I can't get that to work on macOS, and I am sure someone here can suggest why. This is an alternative method of getting what you want.

101Dude avatar Oct 23 '23 09:10 101Dude

@101Dude you should not use rm with xargs like that; it will do the wrong thing with file names that contain spaces or otherwise need quoting.

macau23 avatar Nov 28 '23 09:11 macau23

@macau23 this is what I ended up using and it works well.

xargs runs into issues when path names contain special characters, and fdupes doesn't have a -print0 option like find does, so the pipeline trips up.

The following command results in an error because of a single quote in a filename:

fdupes -r dir1 dir2|grep dir1/|xargs rm

xargs: unterminated quote

The UNIX way around this is to add another command between the grep and xargs commands:

... | tr '\n' '\0' | xargs -0 -n1 ...

This addition comes from an excellent explanation at "Make xargs execute the command once for each line of input".

The full command would then be:

fdupes -r dir1 dir2 |grep "dir2/" |tr '\n' '\0' |xargs -0 -n1 rm -v

Check this command first using echo or another non-destructive command before using rm. Adding the -v option allows you to see what has been removed.

An example of a non-destructive option is to use the tag command (install with homebrew). Add a red Finder tag to files that are duplicates so you can manually select and drag to trash :)

fdupes -r dir1 dir2 | grep "dir2" | tr '\n' '\0' | xargs -0 -n1 -I % tag -a red %

101Dude avatar Nov 29 '23 00:11 101Dude
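
A possible variation on the pipeline above, sketched under a few assumptions: it replaces grep, tr and xargs with a plain read loop, matches the directory as a prefix of the path rather than anywhere in it (grep "dir2" would also match something like dir1/dir2-old/x.jpg), and defaults to a dry run. The function name remove_under and its --really switch are made up for this example; file names are assumed not to contain newlines.

```shell
#!/bin/sh
# remove_under PREFIX [--really] (hypothetical helper): read one path per
# line on stdin and act only on paths that start with PREFIX. Without
# --really it just reports what it would remove.
remove_under() {
    prefix=$1
    mode=${2:-dry-run}
    # An empty prefix would match every line, so refuse it.
    [ -n "$prefix" ] || { echo "remove_under: empty prefix" >&2; return 1; }
    while IFS= read -r path; do
        case $path in
            "$prefix"*)   # quoted prefix is matched literally, then anything
                if [ "$mode" = "--really" ]; then
                    rm -v -- "$path"
                else
                    printf 'would remove: %s\n' "$path"
                fi
                ;;
        esac
    done
}
```

Usage would look like fdupes -r dir1 dir2 | remove_under dir2/ for the dry run, adding --really once the list looks right. Like the original pipeline, this removes every listed copy under dir2, including sets whose copies all live in dir2, so it is worth checking the dry-run output for that case.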

same request

sylvainsab avatar Mar 04 '24 07:03 sylvainsab

Any solution?

VD171 avatar Mar 25 '24 17:03 VD171