jdupes

exclude duplicates with different names

Open madsurgeon opened this issue 4 years ago • 3 comments

Hi, for me it would be very useful to have a switch that excludes all matches with different filenames, so that files are only considered duplicates if they have both the same contents and the same filename.

madsurgeon, May 04 '21 11:05

hi, same requirement here :)

arafesthain, Dec 28 '21 15:12

I'll see what I can do.

jbruchon, Dec 28 '21 15:12

Quick and dirty method:

Assume a previous jdupes run like:

time sudo jdupes \
    --one-file-system \
    --recurse \
    --size \
    --permissions \
    --order=time \
    -X size+=:5M \
    -M \
    /path/to/scan/ | tee jdupes_log.txt

and log output like:

11574824 bytes each:
/path/one/filename.txt
/path/backup/two/filename.txt
/path/somedir/backup/three/filename.txt

Define a function that parses the log file and reports files with exactly the same filename within each set:

strict_link() {
    # $1: jdupes log file; $2: optional action (list|size|<name>sum|hard|soft)
    local line base ref_name="" ref_path=""
    while IFS= read -r line; do
        # A blank line or a "... bytes each:" header starts a new duplicate set
        if [[ -z "$line" || "$line" == *"bytes each:"* ]]; then
            ref_name=""; ref_path=""
            continue
        fi
        base=$(basename -- "$line")
        if [[ -n "$ref_name" && "$base" == "$ref_name" ]]; then
            echo "$ref_path -> $line"
            case "$2" in
                list) ls -l -- "$ref_path" "$line" ;;
                size) du -csh -- "$ref_path" "$line" ;;
                *sum) "$2" -- "$ref_path" "$line" ;;      # e.g. md5sum, sha256sum
                hard) ln -f -- "$ref_path" "$line" ;;
                soft) ln -sf -- "$ref_path" "$line" ;;    # needs absolute paths in the log
            esac
        else
            # Different name: this path becomes the new reference for the set
            ref_name="$base"; ref_path="$line"
        fi
    done < "$1"
}

USAGE: strict_link jdupes_log.txt [list|size|md5sum|hard|soft]
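
The function above compares each path against a single reference path at a time, so two same-named files in a set can be missed when a differently named file sits between them. A minimal sketch of an order-independent variant follows. It assumes bash 4 or newer (for associative arrays) and the same log layout as above. The name same_name_pairs is a hypothetical helper, not part of jdupes, and it only prints pairs without linking anything.

```shell
#!/usr/bin/env bash
# Sketch: list same-name duplicates per set, regardless of their order.
# same_name_pairs is a hypothetical helper, not part of jdupes.
same_name_pairs() {
    local line base
    local -A first_seen=()      # basename -> first path seen in the current set
    while IFS= read -r line; do
        # A blank line or a "... bytes each:" header starts a new set
        if [[ -z "$line" || "$line" == *"bytes each:"* ]]; then
            first_seen=()
            continue
        fi
        base=${line##*/}        # text after the last slash
        [[ -z "$base" ]] && continue
        if [[ -n "${first_seen[$base]+set}" ]]; then
            # Same name already seen in this set: report the pair
            printf '%s -> %s\n' "${first_seen[$base]}" "$line"
        else
            first_seen[$base]=$line
        fi
    done < "$1"
}
```

Run it as: same_name_pairs jdupes_log.txt. For the three-file example log above, it would pair /path/one/filename.txt with each of the other two paths. The printed pairs can be reviewed before deciding whether to hard-link or symlink anything.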

justbrowsing, Jan 03 '23 04:01