
[CT-228] [Feature] Allow tests to warn/fail based on percentage

Open ChenyuLInx opened this issue 3 years ago • 27 comments

Is there an existing feature request for this?

  • [X] I have searched the existing issues

Describe the Feature

Add the ability for warn/fail thresholds to be a percentage instead of just a fixed number. We will likely need to define the total count against which the percentage is calculated. Original request: #4334

  - name: my_view
    columns:
      - name: id
        tests:
          - unique:
              config:
                warn_if: "> 10%"
                fail_if: "> 20%"

Describe alternatives you've considered

No response

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

ChenyuLInx avatar Feb 14 '22 19:02 ChenyuLInx

If this is not already earmarked for development, I'd be interested (with some guidance) in contributing this.

jaypeedevlin avatar Feb 14 '22 20:02 jaypeedevlin

@jtcohen6 is this something I could take on, do you think?

jaypeedevlin avatar Mar 08 '22 16:03 jaypeedevlin

Hey @jaypeedevlin, sorry for getting back to you so late! We haven't scheduled this for development yet. Thanks for the interest!! Your contribution is very welcome! I can work with you if you have any questions.

ChenyuLInx avatar Mar 08 '22 20:03 ChenyuLInx

I am fairly new to this area. @jtcohen6, I took a look at the implementation; maybe we could modify it here to convert warn_if and fail_if into something that compares the fail_calc against total_number * percentage when a percentage is provided?

ChenyuLInx avatar Mar 08 '22 20:03 ChenyuLInx

@jaypeedevlin I'd be very excited to have your help here!

The tricky piece is finding a way to get the total number of records (with where filter included, I think). For example, in the case where warn_if: "> 10%" and error_if: "!= 0", we'd want something like (half-rendered pseudo-code):

    select
      {{ fail_calc }} as failures,
      -- multiply by 100.0 before dividing, to avoid integer division
      ({{ fail_calc }} * 100.0 / (select count(*) from {{ model }})) > 10 as should_warn,
      {{ fail_calc }} != 0 as should_error
    from (
      {{ main_sql }}
      {{ "limit " ~ limit if limit != none }}
    ) dbt_internal_test
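
For concreteness, here's roughly how that could render for a unique test on a hypothetical my_model.id, keeping the default fail_calc of count(*) (illustrative only):

    select
      count(*) as failures,
      -- for the unique test, count(*) counts duplicated values (failing groups)
      (count(*) * 100.0 / (select count(*) from my_model)) > 10 as should_warn,
      count(*) != 0 as should_error
    from (
      select id as unique_field, count(*) as n_records
      from my_model
      where id is not null
      group by id
      having count(*) > 1
    ) dbt_internal_test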

The challenges I ran into when trying to do this in the past:

  • We don't have {{ model }} available in the materialization context, in the same way that we have it as an argument to the generic test macro. In the materialization, model refers to the test node, not the model it's selecting from.
  • We could hack this by duplicating logic with {{ get_where_subquery() }} and model.refs[0][0]... which is gross :( (sketched below)
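
For illustration, that hack might look something like this inside the test materialization (a sketch only; it assumes the test refs exactly one model, and that model.refs[0][0] holds that model's name):

    {# 'model' here is the *test* node, so reach into its first ref to find the tested model #}
    {%- set tested_model_name = model.refs[0][0] -%}
    {%- set tested_relation = ref(tested_model_name) -%}
    {# get_where_subquery(tested_relation) could then re-apply any `where` config #}

    select count(*) as total_records from {{ tested_relation }}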

It's A Known Thing that we need a better way to store, for each test node, the model the test is defined on, so that it's accessible in lots of other places, including in the test materialization. Right now it's just inferred from the test's dependencies. I'd be open to the Language team's input on the best way to store that.

jtcohen6 avatar Mar 08 '22 21:03 jtcohen6

Hey all - I was looking to do this, and saw that there's a WIP at #5172 based on statically checking for a literal percent sign and requerying the raw model to compute the percentage when that's the case.

However, my first instinct when I envisioned this feature was that fail_calc should simply be allowed to be a computation over all rows -- those that match the test condition and those that don't -- and that rows that don't match the condition should be signified by a specially-named boolean column. If that were possible, then a percentage-based error case could be defined as something like fail_calc: 1.0 * count(dbt_test_failed) / count(*), without a special one-off implementation in dbt-core. On top of being simpler, it would also open up a variety of other use cases that involve comparing failure cases to the broader dataset -- e.g. outlier detection, comparison to rolling averages via window functions, comparison within a windowed partition group, etc.

Would there be a straightforward way to refactor default dbt test queries so that:

  • rather than (current behavior): failing cases being represented by "all rows that make it to the final result",
  • instead (proposed behavior): failing cases are represented by "all rows in the final result that have [a statically-defined magic column named e.g.] dbt_test_failed not equal to null [or maybe equal to true? I'm not sure which makes more sense]"?

vergenzt avatar May 18 '22 20:05 vergenzt

Hmm though that breaks backcompat -- maybe add a test config called +fail_calc_includes_passing_rows that defaults to false? Then by default, count(*) and the like will continue to work, and in order to do calculations that include passing rows (such as the 1.0 * count(dbt_test_failed) / count(*) example above), you'd need to opt in via +fail_calc_includes_passing_rows: true on the test.
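
For example, opting in might look something like this in a schema file (a sketch only: fail_calc_includes_passing_rows and the dbt_test_failed column are hypothetical per the above, and count(dbt_test_failed) assumes passing rows get null rather than false):

    models:
      - name: my_view
        columns:
          - name: id
            tests:
              - not_null:
                  config:
                    fail_calc_includes_passing_rows: true
                    fail_calc: "1.0 * count(dbt_test_failed) / count(*)"
                    warn_if: "> 0.1"
                    error_if: "> 0.2"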

vergenzt avatar May 18 '22 20:05 vergenzt

@vergenzt If I understand you right, you're thinking that the not_null test should evolve from:

select * from {{ model }}
where {{ column_name }} is null

To:

select
    -- true for failing rows, null otherwise, so count(dbt_test_failed) counts only failures
    case when {{ column_name }} is null then true end as dbt_test_failed
from {{ model }}

And the fail_calc, instead of being count(*), would be count(dbt_test_failed).

In order to calculate percentages, the fail_calc would simply evolve to 1.0 * count(dbt_test_failed) / count(*), or even (in the case of more-custom tests) to 1.0 * sum(case when dbt_test_failed then some_value else 0 end) / sum(some_value).

That's a neat idea! I think the SQL may be a bit trickier to write for unique, but for not_null and accepted_values it's intuitive enough. We'd need to rethink --store-failures very slightly; rather than saving the whole test query result to the database (i.e. the whole table), it should probably just save:

select * from (
    {{ test_query }}
)
where dbt_test_failed = true

@jaypeedevlin @ehmartens Curious to hear your thoughts here as well!

jtcohen6 avatar May 19 '22 11:05 jtcohen6

@jtcohen6 yep that's what I'm thinking! Though the example new SQL could be even simpler: select ({{ column_name }} is null) as dbt_test_failed from {{ model }}. 🙂

If there were a way to isolate tests' condition expressions then this could be even more generic:

{%- set test_condition_expr = ... %} {#- assume condition sql is in scope; maybe via macro arg somehow? 🤷  #}
... 
{%- if config.get('fail_calc_includes_passing_rows') %}
  select *, ({{ test_condition_expr }}) as dbt_test_failed from {{ model }}
{%- else %}
  select * from {{ model }} where ({{ test_condition_expr }})
{%- endif %}

Not sure how we'd get access to that expression in a backwards-compatible way though, given it looks like tests are currently written as full SQL select statements. 🤷

Also test_unique could simply be:

 select
     {{ column_name }} as unique_field,
-    count(*) as n_records
+    count(*) as n_records,
+    -- true/null rather than true/false, so count(dbt_test_failed) counts only
+    -- failures (and we avoid referencing the n_records alias in the same select)
+    case when count(*) > 1 then true end as dbt_test_failed

 from {{ model }}
 where {{ column_name }} is not null
 group by {{ column_name }}
-having count(*) > 1

vergenzt avatar May 19 '22 13:05 vergenzt

Though the example new SQL could be even simpler

Heh, I opted for verbose-and-clear over nice-and-concise, but I agree that I like yours better :)

Another way toward backwards compatibility: as each generic test converts its SQL, it reconfigures its fail_calc accordingly:

{% macro default__test_unique(model, column_name) %}

{{ config(fail_calc = 'count(dbt_test_failed)') }}

select
    {{ column_name }} as unique_field,
    count(*) as n_records,
    -- true for failing groups, null otherwise, so the fail_calc counts only failures
    case when count(*) > 1 then true end as dbt_test_failed

from {{ model }}
where {{ column_name }} is not null
group by {{ column_name }}

{% endmacro %}

At some point (dbt v2), we reset the default fail_calc from count(*) to count(dbt_test_failed). In the meantime, though, it would be up to the user to know which generic tests could now handle a percentage-based fail_calc.

I have a conjecture that it would be possible to do this in a backwards-compatible way, without requiring changes to the many, many custom generic tests already out in the world, which this issue is too narrow to contain. I'll leave it as an exercise to someone else, for now, to find a way to write some even-cleverer SQL, perhaps using (gulp) a correlated subquery. Doesn't mean that's the way we ought to go.
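
As a starting point, one shape such a wrapper could take without touching the test SQL itself (a sketch only; my_model stands in for the tested relation, with a rendered not_null test inlined):

    with failures as (
        -- the unmodified generic test query goes here, whatever it selects
        select * from my_model where id is null
    )
    select
        (select count(*) from failures) as n_failures,
        100.0 * (select count(*) from failures)
            / nullif((select count(*) from my_model), 0) as failure_pct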

jtcohen6 avatar May 20 '22 15:05 jtcohen6

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

github-actions[bot] avatar Nov 17 '22 02:11 github-actions[bot]

Still interested in this! (Just commenting to dismiss Stalebot)

vergenzt avatar Nov 17 '22 14:11 vergenzt

I'm curious if there have been any updates on this functionality?

mat-grisham-cfa avatar May 03 '23 12:05 mat-grisham-cfa

This would be very useful. Is it still being worked on?

ajonesINQ avatar May 09 '23 13:05 ajonesINQ

Agreed that this would be very useful!

Bley5271 avatar May 23 '23 19:05 Bley5271

Is this still being worked on? Would be a great feature to have.

Zwagemaker avatar Aug 10 '23 06:08 Zwagemaker

I abandoned work on this, unfortunately

jaypeedevlin avatar Aug 10 '23 06:08 jaypeedevlin

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] avatar Feb 25 '24 01:02 github-actions[bot]

Still think this would be a very convenient feature

remeus avatar Feb 26 '24 16:02 remeus

Bumping the thread, because this would be extremely useful.

joaoreispv avatar Mar 21 '24 16:03 joaoreispv

Bumping again, would be a great feature to have!

phillipkamps avatar Apr 08 '24 15:04 phillipkamps

Hi all, I found a workaround for this by writing a generic test in dbt and then referencing it in the yml files like any ordinary test.

This one implements a relationships test with a 1% error threshold and works on BigQuery. Feel free to change it as you see fit.

{% test relationships_with_1_pct_error_threshold(model, column_name, field, to) %}

with parent as (
    select
        {{ field }} as id
    from {{ to }}
),

child as (
    select
        {{ column_name }} as id
    from {{ model }}
),

-- allow up to 1% of the distinct parent ids as an error budget
error_threshold as (
    select
        cast(count(distinct id) * .01 as int) as num
    from parent
)

-- return the orphaned child ids only when their distinct count exceeds the
-- threshold; below the threshold the test returns zero rows and passes
select distinct id
from child
where id is not null
    and id not in (select id from parent)
    and (
        select count(distinct id)
        from child
        where id is not null
            and id not in (select id from parent)
        ) > (select num from error_threshold)

{% endtest %}
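
To use it, reference the test from a schema yml like any built-in relationships test (model and column names here are placeholders):

    models:
      - name: orders
        columns:
          - name: customer_id
            tests:
              - relationships_with_1_pct_error_threshold:
                  to: ref('customers')
                  field: id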

gracie2339 avatar Apr 24 '24 20:04 gracie2339

Adding to the bump - we've implemented our own generic "percent-based" test macro in our shop, but having this supported by the language (or giving us additional hooks / a more standard generic interface!) would do wonders

Oracen avatar May 21 '24 01:05 Oracen

Bumping again, hopefully someone is working on it

sqnico avatar Aug 08 '24 05:08 sqnico

Also bumping. It's very frustrating to have to write a custom test when we need the error threshold to be based on a percentage of the total number of rows in the model. This seems like something that should be baked in.

jmesterh avatar Aug 15 '24 17:08 jmesterh