dbt-project-evaluator icon indicating copy to clipboard operation
dbt-project-evaluator copied to clipboard

handle mising objects for strings

Open dave-connors-3 opened this issue 1 year ago • 2 comments

This is a:

  • [x] bug fix PR with no breaking changes
  • [ ] new functionality

Link to Issue

Closes #176

Description & motivation

When parts of the unpack macros didn't exist for some node types (i.e. tests don't have on_schema_change settings) the wrap_string_with_quotes macro returns an empty string. This change returns null for cases where the object doesn't exist.

Integration Test Screenshot

image

I added a not_empty_string_test (from @epapineau! this is an open PR on the dbt utils repo) to all the varchar fields in int_all_graph_resources -- decided not to push them up in case we don't want 30 more tests in the integration tests flow, but can add them here if we feel they are important

Checklist

  • [x] I have verified that these changes work locally on the following warehouses (Note: it's okay if you do not have access to all warehouses, this helps us understand what has been covered)
    • [ ] BigQuery
    • [x] Postgres
    • [x] Redshift
    • [x] Snowflake
  • [ ] I have updated the README.md (if applicable)
  • [ ] I have added tests & descriptions to my models (and macros if applicable)
  • [ ] I have added an entry to CHANGELOG.md

dave-connors-3 avatar Aug 16 '22 16:08 dave-connors-3

Copying the info from Slack and as discussed.

The casting is actually from dbt_utils.union_relations.

{% for col_name in ordered_column_names -%}

    {%- set col = column_superset[col_name] %}
    {%- set col_type = column_override.get(col.column, col.data_type) %}
    {%- set col_name = adapter.quote(col_name) if col_name in relation_columns[relation] else 'null' %}
    cast({{ col_name }} as {{ col_type }}) as {{ col.quoted }} {% if not loop.last %},{% endif -%}

{%- endfor %}

Which means we can use the column_override argument of the macro to make the failing columns cast as string/varchar.

b-per avatar Aug 19 '22 10:08 b-per

@b-per updated the logic to handle nulls a bit more gracefully -- local integration tests are running fine for databricks (both folders). Can't seem to replicate the error from Circle CI

dave-connors-3 avatar Aug 26 '22 19:08 dave-connors-3

Does this address Benoit's point about the union_relations macro? I'm not seeing the column_override being used

graciegoheen avatar Sep 06 '22 16:09 graciegoheen

@graciegoheen this change should avoid the need to pass those overrides at all -- this will ensure that the null is properly cast as strings before we attempt to union the models together, so we should not see the VOID datatype at all

dave-connors-3 avatar Sep 06 '22 17:09 dave-connors-3

You mentioned that you don't want to add 30 tests to check if strings are not empty, which is fair. Can we add at least one or two tests though? I think it will be useful to avoid regressions when we updated the code and potentially add support for more DWs.

b-per avatar Sep 13 '22 07:09 b-per

You mentioned that you don't want to add 30 tests to check if strings are not empty, which is fair. Can we add at least one or two tests though? I think it will be useful to avoid regressions when we updated the code and potentially add support for more DWs.

@b-per -- great idea! just added a test to fail if any values are empty string values in the int_all_graph_resources.on_schema_change column, which was one of the columns that had this issue!

dave-connors-3 avatar Sep 13 '22 13:09 dave-connors-3