data-quality topic
cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
ydata-profiling
One-line data quality profiling and exploratory data analysis for Pandas and Spark DataFrames.
applied-ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
lakeFS
Data version control for your data lake; Git for data.
feathr
A scalable, unified data and AI engineering platform for enterprise.
data-diff
Compare tables within or across databases
great_expectations
Always know what to expect from your data.
pointblank
Data quality assessment and metadata reporting for data frames and database tables
feast
The Open Source Feature Store for Machine Learning
versatile-data-kit
One framework to develop, deploy and operate data workflows with Python and SQL.