Dedupe.io
dedupe
🆔 A Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution.
csvdedupe
🆔 Command-line tool for deduplicating CSV files
dedupe-examples
🆔 Examples for using the dedupe library
affinegap
📐 A Cython implementation of the affine gap string distance
address-matching
Python script for matching a list of messy addresses against a gazetteer using dedupe.
dedupe-geocoder
📍 Demonstration of how dedupe might be used as a geocoder
hcluster
Hierarchical Clustering Algorithms
pyhacrf
📐 Hidden alignment conditional random field for classifying string pairs.
pylbfgs
🚠 Python/Cython wrapper for liblbfgs