searchcode-server

Duplicates detection

Open quasarea opened this issue 8 years ago • 4 comments

I think you are already collecting hashes of files. It would be useful to have duplicates compacted into a single result with multiple paths.

quasarea avatar Jun 15 '17 09:06 quasarea
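
For illustration, a minimal sketch of the idea requested above: grouping results whose file contents hash identically, so each group appears once with all of its paths. This is not searchcode-server's actual code. The function names (`content_hash`, `compact_duplicates`), the choice of SHA-1, and the `(path, content)` input shape are all assumptions made for the example.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hex digest of the file content (SHA-1 is an assumption here)."""
    return hashlib.sha1(data).hexdigest()


def compact_duplicates(results):
    """Collapse (path, content) pairs with identical content into one entry.

    Returns a list of {"hash": ..., "paths": [...]} dicts, ordered by the
    first appearance of each distinct content hash.
    """
    grouped = {}  # dicts preserve insertion order in Python 3.7+
    for path, content in results:
        grouped.setdefault(content_hash(content), []).append(path)
    return [{"hash": h, "paths": paths} for h, paths in grouped.items()]


if __name__ == "__main__":
    # Hypothetical search results: two identical files and one different file.
    hits = [
        ("vendor/a/jquery.js", b"same bytes"),
        ("vendor/b/jquery.js", b"same bytes"),
        ("src/app.js", b"other bytes"),
    ]
    for entry in compact_duplicates(hits):
        print(entry["hash"][:8], entry["paths"])
```

Exact hashing only catches byte-for-byte copies; files that differ by a single character end up in separate groups.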

This is something that I need in order to port searchcode.com over as well. I am still investigating adding a Simhash calculation in order to also mark similar files as duplicates. An example of how it is implemented there:

https://searchcode.com/?q=jquery

Note that after a pause, a "Show 100 matches" option pops in which, when clicked, displays the duplicates.

boyter avatar Jun 15 '17 22:06 boyter
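
For readers unfamiliar with Simhash, a rough sketch of the general technique mentioned above. It is not the searchcode.com or searchcode-server implementation. The tokenizer, the use of MD5 as the per-token hash, the 64-bit fingerprint width, and the distance threshold of 3 are all assumptions chosen for the example.

```python
import hashlib
import re

BITS = 64  # fingerprint width (assumption for this sketch)


def simhash(text: str) -> int:
    """Compute a 64-bit Simhash fingerprint over word tokens."""
    weights = [0] * BITS
    for token in re.findall(r"\w+", text.lower()):
        # Take the first 8 bytes of MD5 as a 64-bit token hash.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(BITS):
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions where the two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: str, b: str, threshold: int = 3) -> bool:
    """Treat texts as similar when fingerprints differ in few bits."""
    return hamming_distance(simhash(a), simhash(b)) <= threshold
```

Unlike an exact content hash, similar inputs tend to produce fingerprints that differ in only a few bits, so a small Hamming distance can be used to flag likely near-duplicates. The right threshold depends on the corpus and would need tuning.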

Yep, that looks good, you had so much great stuff there ;p

quasarea avatar Jun 16 '17 10:06 quasarea

Glad you think so. It certainly is a lot to port across!

boyter avatar Jun 18 '17 20:06 boyter

Pushing this one out to 1.3.12

boyter avatar Sep 06 '17 08:09 boyter