jdupes
exclude duplicates with different names
Hi, for me it would be very useful to have a switch that excludes all matches with different filenames, so that files are only considered duplicates if they have both the same contents and the same filename.
hi, same requirement here :)
I'll see what I can do.
Quick and dirty method:
Assume a previous jdupes run like this:
time sudo jdupes \
--one-file-system \
--recurse \
--size \
--permissions \
--order=time \
-X size+=:5M \
-M \
/path/to/scan/ | tee jdupes_log.txt
with log output such as:
11574824 bytes each:
/path/one/filename.txt
/path/backup/two/filename.txt
/path/somedir/backup/three/filename.txt
Define a function that parses the log file and acts only on files that share exactly the same filename within each set:
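strict_link() {
    local line name link oldline
    # Read the log line by line; IFS= and -r keep paths intact and avoid
    # changing IFS for the calling shell.
    while IFS= read -r line; do
        [[ -z "$line" ]] && continue    # skip blank separators between sets
        name=$(basename -- "$line" 2>/dev/null)
        if [[ "$line" == *"bytes each:"* ]]; then
            # Size header: a new duplicate set starts, forget the reference file.
            link='' oldline=''
        elif [[ -n "$link" ]]; then
            if [[ "$name" == "$link" ]]; then
                # Same contents and same filename as the reference file.
                echo "$oldline -> $line"
                [[ "$2" == "list" ]] && ls -l "$oldline" "$line"
                [[ "$2" == "size" ]] && du -csh "$oldline" "$line"
                [[ "$2" =~ sum ]] && "$2" "$oldline" "$line"    # e.g. md5sum
                [[ "$2" == "hard" ]] && ln -f "$oldline" "$line"
                [[ "$2" == "soft" ]] && ln -sf "$oldline" "$line"
            else
                # Different filename: drop the reference; the next file becomes the new one.
                link='' oldline=''
            fi
        else
            # First file of a set (or first after a mismatch) becomes the reference.
            link="$name"
            oldline="$line"
        fi
    done < "$1"
}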
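USAGE: strict_link jdupes_log.txt [list|size|md5sum|hard|soft]
With no second argument it only prints the matching pairs; hard and soft replace the later file with a link to the first one.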
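One limitation of the function above: a file with a different name resets the reference, so two identically named files separated by a differently named one are never paired. A possible variant, only a sketch and not part of jdupes, remembers the first path seen for every filename within a set. The name same_name_pairs is made up for illustration, and it assumes the scanned paths are absolute (lines not starting with / are skipped, which also drops the -M summary). It only prints pairs and links nothing.

```shell
# Hypothetical helper: list "first -> later" pairs of files that share a
# filename within the same jdupes duplicate set. Prints only; changes nothing.
same_name_pairs() {
    awk '
        # A size header or a blank line starts a new duplicate set.
        /bytes each:$/ || /^$/ { split("", first); next }
        # Assumption: scanned paths are absolute; skip anything else.
        !/^\// { next }
        {
            name = $0
            sub(/.*\//, "", name)    # strip the directory part
            if (name in first)
                print first[name] " -> " $0
            else
                first[name] = $0
        }
    ' "$1"
}
```

Run against the sample log above, same_name_pairs jdupes_log.txt should print /path/one/filename.txt paired with each of the other two paths. The output can be reviewed first and then fed to ln by hand or by a loop.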