comparator
Similarity and distance measures for clustering and record linkage applications in R
Results: 3
comparator issues
@ngmarchant the Levenshtein distance can be implemented using only two rows for `dmat`, instead of using a square matrix. That could significantly reduce memory usage when comparing long sequences (400...
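The two-row idea can be sketched as follows. This is a minimal standalone illustration assuming unit insertion, deletion and substitution costs; the function name `levenshtein_two_rows` and the buffers `prev` and `curr` are hypothetical and are not taken from the package's source, where the table is called `dmat`.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of comparator): Levenshtein distance with
// unit costs, keeping only two rows of the dynamic-programming table.
std::size_t levenshtein_two_rows(const std::string& x, const std::string& y) {
    // Let the shorter string index the columns so both rows stay short.
    const std::string& a = (x.size() >= y.size()) ? x : y;  // indexes rows
    const std::string& b = (x.size() >= y.size()) ? y : x;  // indexes columns

    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);

    // Row 0: turning the empty prefix of `a` into b[0..j) takes j insertions.
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;

    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;  // turning a[0..i) into the empty string takes i deletions
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t substitution = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            const std::size_t deletion = prev[j] + 1;
            const std::size_t insertion = curr[j - 1] + 1;
            curr[j] = std::min({substitution, deletion, insertion});
        }
        std::swap(prev, curr);  // the row just filled becomes the previous row
    }
    return prev[b.size()];
}
```

Memory use is proportional to the length of the shorter sequence rather than to the product of the two lengths. The trade-off is that the full table is no longer available, so an edit script cannot be recovered by backtracking.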
Consider adding support for token-based comparators. After mapping strings to token sets, the similarity of the sets can be measured using:
* Cosine similarity
* Sørensen–Dice coefficient
* Jaccard index...
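To make the three named measures concrete, here is a rough sketch over sets of whitespace-separated tokens. Everything in it is an assumption made for illustration: the names `TokenSet`, `tokenize`, `intersection_size`, `jaccard`, `dice` and `cosine` are hypothetical, whitespace tokenization is only one possible mapping, and the values returned for empty sets are a chosen convention, not something the issue specifies.

```cpp
#include <cmath>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>

using TokenSet = std::set<std::string>;

// Assumed tokenizer: split on whitespace and drop duplicate tokens.
TokenSet tokenize(const std::string& s) {
    std::istringstream in(s);
    TokenSet tokens;
    std::string token;
    while (in >> token) tokens.insert(token);
    return tokens;
}

// Number of tokens present in both sets.
std::size_t intersection_size(const TokenSet& a, const TokenSet& b) {
    std::size_t n = 0;
    for (const auto& token : a) n += b.count(token);
    return n;
}

// Jaccard index: |A and B| / |A or B|. Two empty sets score 1 (assumed convention).
double jaccard(const TokenSet& a, const TokenSet& b) {
    if (a.empty() && b.empty()) return 1.0;
    const double inter = static_cast<double>(intersection_size(a, b));
    return inter / (static_cast<double>(a.size() + b.size()) - inter);
}

// Sørensen-Dice coefficient: 2 |A and B| / (|A| + |B|).
double dice(const TokenSet& a, const TokenSet& b) {
    if (a.empty() && b.empty()) return 1.0;
    const double inter = static_cast<double>(intersection_size(a, b));
    return 2.0 * inter / static_cast<double>(a.size() + b.size());
}

// Cosine similarity of the sets viewed as binary indicator vectors:
// |A and B| / sqrt(|A| * |B|).
double cosine(const TokenSet& a, const TokenSet& b) {
    if (a.empty() || b.empty()) return (a.empty() && b.empty()) ? 1.0 : 0.0;
    const double inter = static_cast<double>(intersection_size(a, b));
    return inter / std::sqrt(static_cast<double>(a.size()) * static_cast<double>(b.size()));
}
```

For example, "the quick brown fox" and "the lazy brown dog" share two of four tokens each, giving a Jaccard index of 2/6, a Dice coefficient of 0.5 and a cosine similarity of 0.5.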
These measures are currently implemented in R. Porting to C++ is challenging, as it may be necessary to call an R function (the inner measure) from C++.