dataquality topic

List dataquality repositories

OpenMetadata

5.4k stars, 1.0k forks

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance, powered by a central metadata repository, in-depth column-level lineage, and seamless team colla...

data-diff

2.9k stars, 240 forks

Compare tables within or across databases

zingg

902 stars, 109 forks

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

chaos_genius

707 stars, 81 forks

ML-powered analytics engine for outlier detection and root cause analysis.

great_expectations

9.6k stars, 1.5k forks, 69 watchers

Always know what to expect from your data.

deequ

3.2k stars, 513 forks

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

DataCleaner

572 stars, 179 forks

The premier open source Data Quality solution

lale

324 stars, 83 forks

Library for Semi-Automated Data Science

re-data

1.5k stars, 120 forks

re_data - fix data issues before your users & CEO would discover them 😊

amora-data-build-tool

46 stars, 4 forks

Amora Data Build Tool enables analysts and engineers to transform data on the data warehouse (BigQuery) by writing Amora Models that describe the data schema using Python's "PEP484 - Type Hints" and s...