dbt_utils macro interact with --Empty
resolves #https://github.com/dbt-labs/dbt-utils/issues/1009
Problem
As part of our CI process, we use the --empty tag to run and test modified models with no data. However, when these models use dbt_utils macros that expect non-empty values, we encountered errors in the pipeline.
Examples
Create seeds/test_a.csv
kpi,category,sign
A,C_1,+
B,C_2,-
B,C_3,
Test 1. Error we see by triggering dbt run --select test_dbt_utils_pivot --empty
models/test_dbt_utils_pivot.sql
select
{{ dbt_utils.pivot(
'category',
[],
agg='sum',
quote_identifiers=true,
)
}}
from {{ ref('test_a') }}
001003 (42000): SQL compilation error:
syntax error line 8 at position 0 unexpected 'from'.
Test 2. Error we see by triggering dbt run --select test_dbt_utils_get_column_values --empty
models/test_dbt_utils_get_column_values.sql
select
{{ dbt_utils.pivot(
'category',
dbt_utils.get_column_values(ref('test_a'), 'category',default=[]),
agg='sum',
quote_identifiers=true,
)
}}
from {{ ref('test_a') }}
'str object' has no attribute 'database'
> in macro load_relation (macros/adapters/relation.sql)
> called by macro default__get_column_values (macros/sql/get_column_values.sql)
> called by macro get_column_values (macros/sql/get_column_values.sql)
> called by model test_dbt_utils (models/test_dbt_utils.sql)
Test 3. Error we see by triggering dbt run --select test_dbt_utils_get_query_results_as_dict --empty
models/test_dbt_utils_get_query_results_as_dict.sql
{% set kpi_mapping_sql %}
SELECT * FROM {{ ref("test_a") }}
{% endset %}
{%- set kpis = dbt_utils.get_query_results_as_dict(kpi_mapping_sql) -%}
{% for kpi in kpis["KPI"] %}
{% set category = kpis["CATEGORY"][loop.index0] %}
{% set sign = kpis["SIGN"][loop.index0] %}
select '{{ kpi }}' as kpi, '{{ category }}' as category, '{{ sign }}' as sign
{{ "UNION ALL" if not loop.last }}
{% endfor %}
001003 (42000): SQL compilation error:
syntax error line 5 at position 2 unexpected ')'.
Solution
- Changes with
pivot.sql
Added a condition to check if the values are empty (None or []). If the values are empty, the macro returns the default
then_valueimmediately.
- Changes with
get_column_values.sql
Due to -- empty, the target_relation is overridden by 'select * from "DATABASE"."SEEDS"."TEST_A" where false limit 0 ' Added a condition to check if the target_relation is in the valid database.schema.table format. If not, the macro returns the default value immediately.
- Changes with
get_query_results_as_dict.sql
Added a condition to check if the column is empty. If yes, assign empty value to it.(I tested a few different dummy values and ' ' works for most cases.)"
Checklist
- [x] This code is associated with an issue which has been triaged and accepted for development.
- [x] I have read the contributing guide and understand what's expected of me
- [x] I have run this code in development and it appears to resolve the stated issue
- [x] This PR includes tests, or tests are not required/relevant for this PR
- [x] I have updated the README.md (if applicable)