spark-expectations
[FEATURE] Add historic rules
Is your feature request related to a problem? Please describe.
It would be nice to be able to express the desire for the new data to "look like" the old data (in terms of distribution).
Describe the solution you'd like
Spark Expectations already collects summary stats, so a good start would be validation rules that put tolerances on the difference between today's summary and the previous summary.
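To make the tolerance idea concrete, here is a minimal sketch in plain Python. It is not based on any existing Spark Expectations API: the function name `check_against_history`, the stat names, and the choice of a relative (percentage) tolerance are all assumptions made for illustration.

```python
from typing import Dict, List


def check_against_history(
    current: Dict[str, float],
    previous: Dict[str, float],
    tolerances: Dict[str, float],
) -> List[str]:
    """Return the stats whose relative change exceeds their tolerance.

    Hypothetical helper: `tolerances` maps a stat name to the maximum
    allowed relative change, e.g. 0.10 for 10%.
    """
    violations = []
    for name, tolerance in tolerances.items():
        # A stat missing from either run cannot be compared, so flag it.
        if name not in current or name not in previous:
            violations.append(name)
            continue
        old, new = previous[name], current[name]
        if old == 0:
            # Relative change is undefined; treat any movement as drift.
            drifted = new != 0
        else:
            drifted = abs(new - old) / abs(old) > tolerance
        if drifted:
            violations.append(name)
    return violations


# Made-up summary stats for two consecutive runs.
previous_stats = {"row_count": 1000.0, "mean_amount": 52.0}
current_stats = {"row_count": 1040.0, "mean_amount": 70.0}

violations = check_against_history(
    current_stats,
    previous_stats,
    tolerances={"row_count": 0.10, "mean_amount": 0.10},
)
print(violations)  # ['mean_amount']
```

Here `row_count` moved by 4% and passes, while `mean_amount` moved by roughly 35% and is flagged. A real rule would presumably read the previous summary from wherever the stats are stored instead of taking a dict.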
Describe alternatives you've considered
I suppose we could write a query rule where folks just manually write the SQL query.
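For comparison, a hand-written query for this alternative might look like the sketch below. It uses Python's built-in `sqlite3` as a stand-in for Spark SQL so that it runs anywhere. The `stats_history` table, its columns, the dates, and the 10% threshold are all hypothetical.

```python
import sqlite3

# In-memory stand-in for a table holding one summary row per run.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats_history (run_date TEXT, row_count INTEGER)")
conn.executemany(
    "INSERT INTO stats_history VALUES (?, ?)",
    [("2024-01-01", 1000), ("2024-01-02", 1040)],
)

# The user has to spell out the self-join and the tolerance by hand.
query = """
SELECT ABS(today.row_count - prev.row_count) * 1.0 / prev.row_count <= 0.10
FROM stats_history AS today
JOIN stats_history AS prev
  ON prev.run_date = '2024-01-01'
WHERE today.run_date = '2024-01-02'
"""

within_tolerance = bool(conn.execute(query).fetchone()[0])
print(within_tolerance)  # True
```

This works, but every stat and every pair of dates needs its own query, which is the boilerplate a built-in historic rule would remove.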
Additional context
TFDV goes above and beyond with its historic views: https://www.tensorflow.org/tfx/data_validation/get_started