dbt-utils icon indicating copy to clipboard operation
dbt-utils copied to clipboard

Add a new macro to cast all columns based on data types in YAML

Open b-per opened this issue 1 year ago • 1 comments

Describe the feature

Add a macro to automatically cast all the columns to the types defined in the YAML config.

Today, when creating a a model with a contract, if the columns are not in the correct type, they are not casted automatically. (e.g. if id is a varchar(30) in the contract but an int according to the upstream models, running the model will fail..

Describe alternatives you've considered

Cast every single column in the model, repeating information from the YAML file.

Additional context

This feature would work for every database (as long as they support the syntax cast abc as xyz) as the types would be defined by the users in the YAML config.

Who will this benefit?

People who want to define contracts on their models but need to cast all their columns.

This is especially useful for people landing all their data as varchar(...) and then wanting to get proper types in staging.

Are you interested in contributing this feature?

Yes.

This macro seems to be working

{% macro cast_columns_as_yaml() %}
    {% for column, column_data in model.columns.items() %}
        {%- if column_data.data_type -%}
            cast({{ column }} as {{ column_data.data_type }}) as {{ column }} {{"," if not loop.last}}
        {%- else -%}
            {{ column }} {{"," if not loop.last}}
        {%- endif %}
    {% endfor %}
{% endmacro %}

It can be called with

with source as (
    select * from ...
),

renamed as (
 ...
)

select
{{ cast_columns_as_yaml() }}
from renamed

I am happy to create a PR if people think it would be useful. The macro supports the case where the columns are defined but the type is not set for all columns (in that case, the type will be inherited from the upstream column)

b-per avatar Nov 18 '24 09:11 b-per

@b-per go for it

iamfj avatar Mar 15 '25 12:03 iamfj