redflag



Mention & differentiate from similar packages in README

Open kwinkunks opened this issue 2 years ago • 0 comments

Great Expectations - seems big and cumbersome - lesson: stay lean
Evidently - "framework to evaluate, test and monitor ML models in production." - looks nice but quite tied to Jupyter, e.g. lots of plots
ydata-quality - from the same people as profiling (below), and does not look well maintained
Pandas Profiling - generates some alerts, e.g. see below - should test this on my usual datasets

More specialized:

Pandera - "statistical data validation for pandas" (and only pandas) - lesson: support other data formats
pandas_dq - looks quite nice, pandas only
Spectacles - continuous integration tool for Looker and LookML (a GCP service?)
Datafold - time-series and point clouds?
dbt (Data Build Tool) - not sure what this is
Deequ - targeted at Spark/PySpark dataframes only I think

Couple of nice posts etc

http://mfcabrera.com/blog/pandas-dataa-validation-machine-learning.html
See #69

Sep 22 '23 20:09 kwinkunks