duplicate-documents topic

List duplicate-documents repositories

LSH

Stars: 274
Forks: 77

Locality-sensitive hashing (LSH) using MinHash in Python/Cython to detect near-duplicate text documents.
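
To make the idea concrete, here is a minimal pure-Python sketch of MinHash signatures combined with LSH banding. It is not the LSH repository's API: the function names (`shingles`, `minhash_signature`, `estimate_jaccard`, `candidate_pairs`) and the parameters (`k=5`, `num_perm=64`, `bands=16`) are hypothetical choices for illustration, and the seeded BLAKE2b hash stands in for whatever hash family a real implementation uses.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=5):
    """Return the set of k-character shingles of lowercased, whitespace-normalized text."""
    text = " ".join(text.lower().split())
    # max(1, ...) guarantees at least one shingle, even for very short texts
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}


def _seeded_hash(shingle, seed):
    """64-bit hash of a shingle; each seed acts as one independent hash function (assumption)."""
    digest = hashlib.blake2b(
        shingle.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(shingle_set, num_perm=64):
    """MinHash signature: the minimum hash value under each of num_perm hash functions."""
    return [min(_seeded_hash(s, seed) for s in shingle_set) for seed in range(num_perm)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def candidate_pairs(signatures, bands=16):
    """LSH banding: documents that agree on every row of at least one band become candidates."""
    rows = len(next(iter(signatures.values()))) // bands
    buckets = defaultdict(set)
    for doc_id, sig in signatures.items():
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].add(doc_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs
```

A typical use would be to build `signatures = {doc_id: minhash_signature(shingles(text))}` for a corpus, call `candidate_pairs(signatures)`, and then confirm each candidate with `estimate_jaccard` or an exact comparison. More bands with fewer rows each catch lower-similarity pairs at the cost of more false candidates.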