dbt-utils
dbt-utils copied to clipboard
union should explicitly specify table name in casts
Bigquery gets confused by tables that have a column with the same name in CASTs, and will try to cast the entire table instead of the column.
-- causes error, as foo is a struct
select cast(`foo` as STRING) as `foo`
from `data-playground-1234`.`myschema`.`foo`
-- works when table name is specified
select cast(`foo`.`foo` as STRING) as `foo`
from `data-playground-1234`.`myschema`.`foo`
I think this is a simple as adding table.name
here, like:
{%- set col_name = adapter.quote(table.name) + '.' + adapter.quote(col_name) if col_name in table_columns[table] else 'null' %}
I would think that's a general problem with BigQuery and would have to be addressed everywhere you cast a column, not just in this macro? The other obvious "fix" is to not have columns with the same name as a table?
In my real world, it's a pre-existing table with a single column (plus id) so changing it will be difficult
I'm not sure I understand why needing to change it elsewhere is an argument against changing it here?
On Wed, Nov 6, 2019, 10:39 AM Claus Herther [email protected] wrote:
I would think that's a general problem with BigQuery and would have to be addressed everywhere you cast a column, not just in this macro? The other obvious "fix" is to not have columns with the same name as a table?
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/fishtown-analytics/dbt-utils/issues/173?email_source=notifications&email_token=AAD3MAFLZTXQFVZY7GWEDGLQSLQMPA5CNFSM4JJNCLA2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDG64GI#issuecomment-550366745, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAD3MAANSQZSMQQWOXXUNQLQSLQMPANCNFSM4JJNCLAQ .
I'll try to send a PR in the next day or two if you'd be willing to accept it
On Wed, Nov 6, 2019, 1:40 PM Pete Fein [email protected] wrote:
In my real world, it's a pre-existing table with a single column (plus id) so changing it will be difficult
I'm not sure I understand why needing to change it elsewhere is an argument against changing it here?
On Wed, Nov 6, 2019, 10:39 AM Claus Herther [email protected] wrote:
I would think that's a general problem with BigQuery and would have to be addressed everywhere you cast a column, not just in this macro? The other obvious "fix" is to not have columns with the same name as a table?
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/fishtown-analytics/dbt-utils/issues/173?email_source=notifications&email_token=AAD3MAFLZTXQFVZY7GWEDGLQSLQMPA5CNFSM4JJNCLA2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDG64GI#issuecomment-550366745, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAD3MAANSQZSMQQWOXXUNQLQSLQMPANCNFSM4JJNCLAQ .
I said as much in the linked PR, but to follow up on @clausherther's comment above: a good approach to address this issue on BQ might be to alias the table name instead of changing the source table/column names. I do wish that BigQuery was a little bit smarter here.... alas.
So, this works:
with data as (
select 1 as data
)
select cast(data.data as string) from data;
and this also works:
with data as (
select 1 as data
)
select cast(data as string) from data as data______dbt_generated______;
but this does not:
with data as (
select 1 as data
)
select cast(data as string) from data;
--> Invalid cast from STRUCT<data INT64> to STRING at [7:13]
All things considered, I think i like the table alias more that the column qualification. It can be really challenging to get quoting/capitalization exactly right on different databases in different contexts. I think it's straightforward here (nice work on that fix @wearpants) but it other contexts, it may be preferable to alias tables instead.
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.
Hi @drewbanin, I also face this issue in my development with BigQuery, please help me to review the changes if it's make sense to you. Thanks!