jaccard-similarity topic

List jaccard-similarity repositories

stringdistance

75 stars · 15 forks

A fuzzy matching string distance library for Scala and Java that includes Levenshtein distance, Jaro distance, Jaro-Winkler distance, Dice coefficient, N-Gram similarity, Cosine similarity, Jaccard si...

datasketch

2.4k stars · 290 forks

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW

strutil

284 stars · 18 forks

Golang metrics for calculating string similarity and other string utility functions

html-similarity

206 stars · 23 forks

Compare HTML similarity using structural and style metrics

spark-stringmetric

58 stars · 6 forks

Spark functions to run popular phonetic and string matching algorithms

consimilo

62 stars · 4 forks

A Clojure library for querying large datasets by similarity

tika-similarity

103 stars · 59 forks

Tika-Similarity uses the Tika-Python package (a Python port of Apache Tika) to compute file similarity based on metadata features.

lsh-semantic-similarity

16 stars · 2 forks

Locality Sensitive Hashing for semantic similarity (Python 3.x)

Text-Similarity

16 stars · 5 forks

Text similarity computation using MinHash and Jaccard distance on the Reuters dataset

segmentation_metrics

112 stars · 12 forks

A package to compute medical segmentation metrics.