dbt-athena icon indicating copy to clipboard operation
dbt-athena copied to clipboard

Query isn't properly formatted when table is partitioned

Open Batto1300 opened this issue 2 years ago • 2 comments

Hi, I'm trying to run

{{ config(partitioned_by='day') }}
select 
    a,
    b,
    c
 from random_table

I can see that in target/run the query is being compiled to

create table
    dev_datawarehouse.test

    with (
        partitioned_by=ARRAY'day',
        format='parquet'
    )
  as
    
select 
    a,
    b,
    c
 from random_table

Which gives the following error: SYNTAX_ERROR: Unknown type: ARRAY. the compiled query should instead be

create table
    dev_datawarehouse.test

    with (
        partitioned_by=ARRAY['day'],
        format='parquet'
    )
  as
    
select 
    a,
    b,
    c
 from random_table

Which works just fine. Am I missing something obvious?

Batto1300 avatar Mar 18 '22 11:03 Batto1300

Ok I've solved the problem: this is wrong {{ config(partitioned_by='day') }} should be {{ config(partitioned_by=['day']) }} what is less obvious is that if you declare the partition column at the project level like so:

example:
      partitioned_by:
          - day

you'll experience the same error. You should declare it like so:

example:
      partitioned_by: [day]

This StackOverflow answer saying that the 2 forms are equivalent threw me off a bit https://stackoverflow.com/questions/23657086/yaml-multi-line-arrays

Batto1300 avatar Mar 18 '22 11:03 Batto1300

To partition by a date column, use syntax like this:

{{ config( partitioned_by = [ 'some_date_column' ] ) }}

select a, b, some_date_column
  from random_table

aut0clave avatar Mar 18 '22 11:03 aut0clave