dbt-athena
dbt-athena copied to clipboard
Query isn't properly formatted when table is partitioned
Hi, I'm trying to run
{{ config(partitioned_by='day') }}
select
a,
b,
c
from random_table
I can see that in target/run
the query is being compiled to
create table
dev_datawarehouse.test
with (
partitioned_by=ARRAY'day',
format='parquet'
)
as
select
a,
b,
c
from random_table
Which gives the following error: SYNTAX_ERROR: Unknown type: ARRAY.
the compiled query should instead be
create table
dev_datawarehouse.test
with (
partitioned_by=ARRAY['day'],
format='parquet'
)
as
select
a,
b,
c
from random_table
Which works just fine. Am I missing something obvious?
Ok I've solved the problem: this is wrong {{ config(partitioned_by='day') }}
should be {{ config(partitioned_by=['day']) }}
what is less obvious is that if you declare the partition column at the project level like so:
example:
partitioned_by:
- day
you'll experience the same error. You should declare it like so:
example:
partitioned_by: [day]
This StackOverflow answer saying that the 2 forms are equivalent threw me off a bit https://stackoverflow.com/questions/23657086/yaml-multi-line-arrays
To partition by a date column, use syntax like this:
{{ config( partitioned_by = [ 'some_date_column' ] ) }}
select a, b, some_date_column
from random_table