data-quality topic

List data-quality repositories

cleanlab

9.3k stars, 722 forks

The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

ydata-profiling

12.1k stars, 1.6k forks

One-line data quality profiling and exploratory data analysis for Pandas and Spark DataFrames.

applied-ml

26.1k stars, 3.5k forks, 912 watchers

📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

lakeFS

4.1k stars, 330 forks

lakeFS - Data version control for your data lake | Git for data

feathr

2.0k stars, 260 forks

Feathr – A scalable, unified data and AI engineering platform for enterprise

data-diff

2.9k stars, 240 forks

Compare tables within or across databases

great_expectations

9.6k stars, 1.5k forks, 69 watchers

Always know what to expect from your data.

pointblank

835 stars, 51 forks

Data quality assessment and metadata reporting for data frames and database tables.

feast

5.3k stars, 944 forks

The Open Source Feature Store for Machine Learning

versatile-data-kit

413 stars, 54 forks

One framework to develop, deploy and operate data workflows with Python and SQL.