elementary icon indicating copy to clipboard operation
elementary copied to clipboard

[ELE-59] View entire row when test fails

Open nzewail opened this issue 2 years ago • 6 comments

Is your feature request related to a problem? Please describe. When certain tests fail it would be helpful if the query provided didn't just show the failed columns but the entire row because that helps with debugging for certain teams.

For example, if a not_null test fails and I run the test query and it only returns the the single null column N times it failed the query isn't all that useful. Would be helpful to see the entire row or at least a primary key to be able to debug the specific failed rows

Describe the solution you'd like See entire failed row to help with debugging

Would you be willing to contribute this feature? I would be willing to contribute

ELE-59

nzewail avatar Nov 22 '22 01:11 nzewail

Hi @nzewail, thanks for opening the issue. This is an interesting problem. It happens because dbt's not_null test returns by default only the tested column, unless store_failures is on.

{% macro default__test_not_null(model, column_name) %}

{% set column_list = '*' if should_store_failures() else column_name %}

select {{ column_list }}
from {{ model }}
where {{ column_name }} is null

{% endmacro %}

I think that one potential solution would be to override dbt's default test so that it would return the entire row. Just tested it and it does work. What do you think? @oravi

elongl avatar Nov 22 '22 08:11 elongl

Looking at the other generic tests this could work well for not_null but for like accepted_values or unique seems supporting this feature would be more complicated

nzewail avatar Nov 22 '22 22:11 nzewail

@elongl I think the complexity here is that the user experience we should aim to create is that you don't need to change any of your existing test configurations. Just add enriched_results: true flag or something, and it will work. As dbt tests don't have hooks and each test is implemented differently (ideally you would want to support users custom tests, and packages like dbt_expectations), we need to find a creative way to solve this.

Overriding the tests is valid, but probably really only for dbt native tests. For custom and packages it won't solve it.

Maayan-s avatar Nov 23 '22 05:11 Maayan-s

I doubt that it's logically possible to provide a generic solution here. A test could be transformed to so many levels from its original table that identifying the failed results is nearly impossible. Our anomaly tests is a good example of that.

If we override dbt's native test in our package, the user don't need to change anything in their existing configuration, not even a flag, and it's really straightforward to do so. We do support sampling users custom tests and they can change the test to return whatever they'd want to see in the samples, so the problem is mainly with packages whose tests return values which are not indicative enough, most notably, dbt-expectations.

I think a good start would be to deal with dbt's generic tests, and maybe at a later stage deal with packages (in practice, dbt is also just a package). The way I think we would be able to do that in the future is by hooking the test query in the test's materialization like we already do now.

elongl avatar Nov 26 '22 15:11 elongl

Another user use-case.

elongl avatar Jan 24 '23 22:01 elongl

I am the aforementioned use case! +1 to making it a default behavior.

My team has been discussing how best to reduce the amount of digging required by the On Call dev in response to test failures. Our current (read: original) system sends the entire DBT testing log to a specific slack channel via airflow which means we just get the query that shows the failed fields rather than the entire row and then we have to reverse engineer it all to find the PKs, and so on and so forth.

We have some dreams regarding outputting the elementary_test_run_results to Power BI and developing dashboards that provide a clear window into data quality across all our environments. The thought of parsing the full row and returning the exact PKs, etc. is very exciting to us.

Bley5271 avatar Jan 24 '23 23:01 Bley5271