Create/update a macro to verify sum of row counts of n relations
Describe the feature
There are situations where a model simply is a union all of various other models:
-- int_my_source
select * from {{ ref('stg_my_source_a') }}
union all
select * from {{ ref('stg_my_source_b') }}
union all
select * from {{ ref('stg_my_source_c') }}
In this example, it would be useful to have a 1:many test to verify that:
count(int_my_source) = count(stg_my_source_a) + count(stg_my_source_b) + count(stg_my_source_c)
Describe alternatives you've considered
equal_rowcount is the closest thing in dbt-utils, but only does 1:1 relations
Additional context
I don't think this feature is database-specific.
Who will this benefit?
This feature has been requested several times over the years (dbt internal Slack, previous GitHub issue); this feature will benefit those users.
Are you interested in contributing to this feature?
Sure, happy to help.
@aranke Thanks for raising this issue and linking those two other places where this has come up!
If the model is really that simple, what would be the value of a data test here? Wouldn't the counts be guaranteed to be equal? I don't see how the count of rows could be anything but exactly the same in this case.
Hi adding my 2 cents here and I raised this with dbt support too. It's useful in our case when the unioned model is incremental.
i.e. we're selecting from stg_my_source_a where timestamp > now - 2 days for example. And want to see if the unioned model still has all the rows in the individual staging models.
Hope that makes sense.
Thanks for that info @edmei1 !
Could you share an example model that shows more detail? Something like this:
models/int_my_source.sql
{{
config(
materialized='incremental'
)
}}
with unioned_models as (
select * from {{ ref('stg_my_source_a') }}
union all
select * from {{ ref('stg_my_source_b') }}
union all
select * from {{ ref('stg_my_source_c') }}
)
select *
from unioned_models
{% if is_incremental() %}
where event_time >= (select coalesce(max(event_time),'1900-01-01') from {{ this }} )
{% endif %}
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.