Differing files with common prefix detected as duplicates

Open TWAC opened this issue 8 years ago • 6 comments

Tested on Ubuntu 16.04.

#!/bin/sh
git clone https://github.com/ssokolow/fastdupes
cd fastdupes
mkdir files
seq 100000 > files/file1; echo "1" >> files/file1
seq 100000 > files/file2; echo "2" >> files/file2
cmp files/file1 files/file2
python fastdupes.py files

Output:

Cloning into 'fastdupes'...
remote: Counting objects: 279, done.
remote: Total 279 (delta 0), reused 0 (delta 0), pack-reused 279
Receiving objects: 100% (279/279), 93.39 KiB | 0 bytes/s, done.
Resolving deltas: 100% (116/116), done.
Checking connectivity... done.
files/file1 files/file2 differ: byte 588896, line 100001
Found 2 files to be compared for duplication.
Found 1 sets of files with identical sizes. (2 files examined)
Found 1 sets of files with identical header hashes. (2 files examined)
Found 1 sets of files with identical hashes. (2 files examined)
/tmp/fastdupes/files/file2
/tmp/fastdupes/files/file1

TWAC avatar Jun 22 '17 08:06 TWAC

Ugh. I hate these kinds of bugs, the ones that indicate I somehow failed to provide the kind of safety guarantee I thought I had.

The last few days have been busy, but I'll try to track this down as soon as possible.

ssokolow avatar Jun 22 '17 17:06 ssokolow

I'm currently fighting off a summer cold, so it'll be a little while before I get to this. Sorry for the delay.

ssokolow avatar Jun 25 '17 21:06 ssokolow

I understand. Anyway, I looked into it: the normal hashing is behaving like the header hashing. See the pull request.
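
To illustrate the failure mode, here is a minimal standalone sketch. It is not fastdupes' actual code: `hash_file`, `make_file`, and the `HEAD_SIZE` value are hypothetical names and numbers chosen for the example. It rebuilds the two files from the reproduction script and shows that a hash limited to the start of each file cannot tell them apart, while a hash over the whole file can.

```python
import hashlib
import os
import tempfile

# Hypothetical header size for illustration; not taken from fastdupes.
HEAD_SIZE = 16 * 1024


def hash_file(path, limit=None, chunk_size=64 * 1024):
    """Return the SHA-1 hex digest of a file.

    If limit is given, only the first `limit` bytes are hashed
    (a "header hash"); otherwise the whole file is hashed.
    """
    digest = hashlib.sha1()
    remaining = limit
    with open(path, "rb") as handle:
        while remaining is None or remaining > 0:
            size = chunk_size if remaining is None else min(chunk_size, remaining)
            block = handle.read(size)
            if not block:
                break
            digest.update(block)
            if remaining is not None:
                remaining -= len(block)
    return digest.hexdigest()


def make_file(path, last_line):
    """Write the numbers 1..100000 (like `seq 100000`) plus one extra line."""
    with open(path, "w") as out:
        out.writelines("%d\n" % n for n in range(1, 100001))
        out.write(last_line + "\n")


workdir = tempfile.mkdtemp()
file1 = os.path.join(workdir, "file1")
file2 = os.path.join(workdir, "file2")
make_file(file1, "1")
make_file(file2, "2")

same_size = os.path.getsize(file1) == os.path.getsize(file2)
head_equal = hash_file(file1, limit=HEAD_SIZE) == hash_file(file2, limit=HEAD_SIZE)
full_equal = hash_file(file1) == hash_file(file2)

print(same_size, head_equal, full_equal)  # prints: True True False
```

If the final comparison stage is passed a limit by mistake, it gives the same answer as the header stage, and files that differ only near the end get reported as duplicates, which matches the output above.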

TWAC avatar Jun 26 '17 08:06 TWAC

Thanks.

I woke up today with no more traditional symptoms, but no mental capacity either, so I'll review it once that clears up.

ssokolow avatar Jun 26 '17 18:06 ssokolow

OK, I'm back on my feet, but still catching up on things that slipped. Hopefully, I'll have this fixed within the next few days.

ssokolow avatar Jul 07 '17 11:07 ssokolow

Ok, I'm back. Sorry for the silence.

Please continue discussion under PR #32.

ssokolow avatar Aug 20 '17 05:08 ssokolow