Tom Wollnik
These changes would also be really useful for me. I would love to see this merged soon, if possible. @honnix @Tarrasch @dlstadther
Hi. I am assuming that you would like to know which rows failed a particular check. So for example, which rows had null values in a certain column. You can...
@aviatesk we really like this change and would be happy to review an updated version of this PR
@aviatesk please get back to us on this if you get the chance. We are considering closing this PR soon.
We like this idea, can you submit a PR?
Thanks. Don't worry about doing this quickly; we likely won't get around to reviewing the PR until mid-August or the end of August anyway.
We are open to developments in this direction. The implementation will be tricky, as the anomaly detection needs to be adapted to accommodate the new kind of metrics. We currently...
Just to clarify: You want to track an aggregate metric for each of the histogram bins, is that right? So this would be logically similar to e.g. `data.groupBy("firstColumn").agg(count("*"), sum("secondColumn"))`. I...
One idea for a workaround would be to calculate the aggregates using regular spark, e.g. `df.groupBy("firstColumn").agg(sum("secondColumn"))`. Then, you could associate the output of this aggregation with the deequ results based...
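To make the suggested workaround concrete, here is a minimal sketch of what `df.groupBy("firstColumn").agg(sum("secondColumn"))` computes, written in plain Python so it runs without a Spark cluster. The column names and sample rows are hypothetical, chosen only to mirror the expression in the comment above.

```python
# Plain-Python sketch of a per-group sum, equivalent in spirit to
# df.groupBy("firstColumn").agg(sum("secondColumn")) in Spark.
# The data below is made up for illustration.
from collections import defaultdict

rows = [
    {"firstColumn": "a", "secondColumn": 1},
    {"firstColumn": "a", "secondColumn": 2},
    {"firstColumn": "b", "secondColumn": 5},
]

# Accumulate the sum of secondColumn for each distinct firstColumn value.
sums = defaultdict(int)
for row in rows:
    sums[row["firstColumn"]] += row["secondColumn"]

print(dict(sums))  # {'a': 3, 'b': 5}
```

In the actual workaround, the keys of this per-group result would then be matched against the corresponding histogram bins in the deequ output to associate each aggregate with its bin.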
Hi, thanks so much for introducing all these changes. Unfortunately, we don't currently have the bandwidth to give this a proper review. Will keep this PR in the backlog for now....