Tom Wollnik
These changes would also be really useful for me. I would love to see this merged soon, if possible. @honnix @Tarrasch @dlstadther
Hi. I am assuming that you would like to know which rows failed a particular check. So for example, which rows had null values in a certain column. You can...
@aviatesk we really like this change and would be happy to review an updated version of this PR
@aviatesk please get back to us on this if you get the chance. We are considering closing this PR soon.
We like this idea, can you submit a PR?
Thanks. Don't worry about doing this quickly; we likely won't get around to reviewing the PR until mid-August or the end of August anyway.
We are open to developments in this direction. The implementation will be tricky, as the anomaly detection needs to be adapted to accommodate the new kind of metrics. We currently...
Just to clarify: You want to track an aggregate metric for each of the histogram bins, is that right? So this would be logically similar to e.g. `data.groupBy("firstColumn").agg(count("*"), sum("secondColumn"))`. I...
One idea for a workaround would be to calculate the aggregates using regular spark, e.g. `df.groupBy("firstColumn").agg(sum("secondColumn"))`. Then, you could associate the output of this aggregation with the deequ results based...
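To make the suggested workaround concrete, here is a minimal sketch of what `df.groupBy("firstColumn").agg(sum("secondColumn"))` computes, written in plain Python so it runs without a Spark cluster. The column names and sample rows are hypothetical, chosen only to mirror the expression in the comment above.

```python
# Plain-Python sketch of a per-group sum, equivalent in spirit to
# df.groupBy("firstColumn").agg(sum("secondColumn")) in Spark.
# The data below is made up for illustration.
from collections import defaultdict

rows = [
    {"firstColumn": "a", "secondColumn": 1},
    {"firstColumn": "a", "secondColumn": 2},
    {"firstColumn": "b", "secondColumn": 5},
]

# Accumulate the sum of secondColumn for each distinct firstColumn value.
sums = defaultdict(int)
for row in rows:
    sums[row["firstColumn"]] += row["secondColumn"]

print(dict(sums))  # {'a': 3, 'b': 5}
```

In the actual workaround, the keys of this per-group result would then be matched against the corresponding histogram bins in the deequ output to associate each aggregate with its bin.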
Hi, thanks so much for introducing all these changes. Unfortunately, we don't currently have the bandwidth to give this a proper review. Will keep this PR in the backlog for now....