jaccard-similarity topic
stringdistance
A fuzzy matching string distance library for Scala and Java that includes Levenshtein distance, Jaro distance, Jaro-Winkler distance, Dice coefficient, N-Gram similarity, Cosine similarity, Jaccard si...
datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
strutil
Golang metrics for calculating string similarity and other string utility functions
html-similarity
Compare HTML similarity using structural and style metrics
spark-stringmetric
Spark functions to run popular phonetic and string matching algorithms
consimilo
A Clojure library for querying large datasets by similarity
tika-similarity
Tika-Similarity uses the Tika-Python package (a Python port of Apache Tika) to compute file similarity based on metadata features.
lsh-semantic-similarity
Locality Sensitive Hashing for semantic similarity (Python 3.x)
Text-Similarity
Text similarity computation using MinHash and Jaccard distance on the Reuters dataset
segmentation_metrics
A package to compute medical segmentation metrics.