
Mention & differentiate from similar packages in README

Open kwinkunks opened this issue 2 years ago • 0 comments

  • Great Expectations - seems big and cumbersome - lesson: stay lean
  • Evidently - "framework to evaluate, test and monitor ML models in production." - looks nice but quite plugged in to Jupyter, eg lots of plots
  • ydata-quality - from the same people as profiling (below), and does not look well maintained
  • Pandas Profiling - generates some alerts eg see below - should test this on my usual datasets

image

More specialized:

  • Pandera - "statistical data validation for pandas" (and only pandas) - lesson: support other data formats
  • pandas_dq - looks quite nice, pandas only
  • Spectacles - continuous integration tool for Looker and LookML (a GCP service?)
  • Datafold - time-series and point clouds?
  • dbt (Data Build Tool) - not sure what this is
  • Deequ - targeted at Spark/PySpark dataframes only I think

Couple of nice posts etc

  • http://mfcabrera.com/blog/pandas-dataa-validation-machine-learning.html
  • See #69

kwinkunks, Sep 22 '23 20:09