Entity resolution topic
Entity resolution (also known as data matching, data linkage, record linkage, and many other terms) is the task of finding records in a data set that refer to the same entity across different data sources (e.g., data files, books, websites, and databases). Entity resolution is necessary when joining data sets on entities that may or may not share a common identifier (e.g., database key, URI, national identification number); a shared identifier may be missing because of differences in record shape, storage location, or curator style or preference.
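To make the definition concrete, here is a minimal sketch of linking records from two sources that share no identifier. It uses only Python's standard library (difflib.SequenceMatcher) and is not taken from any of the projects listed on this page. The sample records, the field names, the 0.8/0.2 weights, and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Two hypothetical sources with no shared identifier (illustrative data only).
source_a = [
    {"id": "a1", "name": "Jonathan Smith", "city": "Boston"},
    {"id": "a2", "name": "Maria Garcia", "city": "Austin"},
]
source_b = [
    {"id": "b1", "name": "smith, jonathan", "city": "boston"},
    {"id": "b2", "name": "Wei Chen", "city": "Seattle"},
]


def normalize(name):
    """Lowercase, drop commas, and sort tokens so word order does not matter."""
    tokens = name.lower().replace(",", " ").split()
    return " ".join(sorted(tokens))


def similarity(rec_a, rec_b):
    """Weighted score: fuzzy name similarity plus an exact city comparison."""
    name_score = SequenceMatcher(
        None, normalize(rec_a["name"]), normalize(rec_b["name"])
    ).ratio()
    city_score = 1.0 if rec_a["city"].lower() == rec_b["city"].lower() else 0.0
    # Assumed weights; with these, a city mismatch caps the score at 0.8.
    return 0.8 * name_score + 0.2 * city_score


def link(records_a, records_b, threshold=0.85):
    """Compare every pair and keep those scoring at or above the threshold."""
    matches = []
    for a in records_a:
        for b in records_b:
            if similarity(a, b) >= threshold:
                matches.append((a["id"], b["id"]))
    return matches


matches = link(source_a, source_b)
print(matches)
```

Comparing every pair is quadratic in the number of records, so this only suits small inputs. The toolkits listed below typically add blocking or indexing to cut down the candidate pairs, and learn the weights from labeled examples rather than hard-coding them.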
data-matching-software
A list of free data matching and record linkage software.
FEBRL-fork-v0.4.2
Fork of the Freely Extensible Biomedical Record Linkage program
recordlinkage
A powerful and modular toolkit for record linkage and duplicate detection in Python
recordlinkage-annotator
A browser user interface for manual labeling of record pairs.
dedupe
A Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution.
zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Entity-Linking-Recent-Trends
Recent trends of Entity Linking, Disambiguation, and Representation.
vert-papers
This repository contains code and datasets related to entity/knowledge papers from the VERT (Versatile Entity Recognition & disambiguation Toolkit) project, by the Knowledge Computing group at Microso...
csvdedupe
Command-line tool for deduplicating CSV files
dedupe-examples
Examples for using the dedupe library