dbt-redshift [ADAP-671] Statements executed one by one instead of as a whole block

Is this a regression in a recent version of dbt-redshift?

[X] I believe this is a regression in dbt-redshift functionality
[X] I have searched the existing issues, and I could not find an existing issue for this regression

Current Behavior

We have a macro in dbt that runs a set of DDL statements on Redshift. We concatenate those statements in a string (cmd) and run with "{% do run_query(cmd) %}". That string looks like below;

begin;
  ddl_statement_1;
  ddl_statement_2;
  ddl_statement_3;
  ...
  ...
  ddl_statement_x;
end;

After moving to dbt 1.5, we see that those statements are executed one by one, instead of sending the statement as a whole to database. We can see it both in database logs and dbt logs.

Impact

In 1.5 and onwards, these statements take more than 5 minutes to execute. Previously in versions 1.4 and below, they would execute in only a few seconds.

Expected/Previous Behavior

The below statements are executed at once in a single transaction.

begin;
  ddl_statement_1;
  ddl_statement_2;
  ddl_statement_3;
  ...
  ...
  ddl_statement_x;
end;

Steps To Reproduce

Install dbt-core 1.5
Install dot-redshift 1.5.6
Run a set of ddl statements between begin and end in a macro with "{% do run_query(cmd) %}"

Relevant log output

[0m14:34:35.967673 [debug] [MainThread]: On macro_xyz: drop view if exists db_name.schema_name.object_1;
[0m14:34:36.158399 [debug] [MainThread]: SQL status: SUCCESS in 0.0 seconds
[0m14:34:36.158975 [debug] [MainThread]: Using redshift connection "macro_xyz"
[0m14:34:36.159297 [debug] [MainThread]: On macro_xyz: create view db_name.schema_name.object_1 as select * from db_name.another_schema_name.object_1 with no schema binding;
[0m14:34:36.528864 [debug] [MainThread]: SQL status: SUCCESS in 0.0 seconds
[0m14:34:36.531540 [debug] [MainThread]: Using redshift connection "macro_xyz"
[0m14:34:36.532585 [debug] [MainThread]: On macro_xyz: grant select on table db_name.schema_name.object_1 to group db_group;
[0m14:34:36.729741 [debug] [MainThread]: SQL status: SUCCESS in 0.0 seconds
[0m14:34:36.730303 [debug] [MainThread]: Using redshift connection "macro_xyz"
[0m14:34:36.730638 [debug] [MainThread]: On macro_xyz: drop view if exists db_name.schema_name.object_2;
[0m14:34:36.923462 [debug] [MainThread]: SQL status: SUCCESS in 0.0 seconds
[0m14:34:36.926900 [debug] [MainThread]: Using redshift connection "macro_xyz"
[0m14:34:36.927759 [debug] [MainThread]: On macro_xyz: create view db_name.schema_name.object_2 as select * from db_name.another_schema_name.object_2 with no schema binding;
[0m14:34:37.306524 [debug] [MainThread]: SQL status: SUCCESS in 0.0 seconds
[0m14:34:37.307047 [debug] [MainThread]: Using redshift connection "macro_xyz"
[0m14:34:37.307360 [debug] [MainThread]: On macro_xyz: grant select on table db_name.schema_name.object_2 to group db_group;
[0m14:34:37.514141 [debug] [MainThread]: SQL status: SUCCESS in 0.0 seconds

Environment

- OS: MacOS 13.4 and Ubuntu
- Python: 3.9.16
- dbt-core (working version): 1.4.6
- dbt-redshift (working version): 1.4.6
- dbt-core (regression version): 1.5.0, 1.5.1, 1.5.2
- dbt-redshift (regression version): 1.5.1, 1.5.2, 1.5.6

Additional Context

No response

Jul 06 '23 09:07 naumovdalex

We noticed the same with our post-hook, what helped us to set autocommit: False in profiles.yml (more details here)

Jul 06 '23 10:07 jan-benisek

@naumovdalex sorry to hear your existing workflows is failing. As part of the migration to using the redshift_connector connector library in 1.5 instead of psycopg2, this adapter's approach to transactions have changed such that every query is executed at once.

Check out #463 for more context, and I'd love to see if you get the the intended behavior by doing as @jan-benisek says and adding autocommit: False to your profiles.yml.

I suspect that you may still encounter issues due to design decisions of the underlying connector library, for example see https://github.com/aws/amazon-redshift-python-driver/issues/162.

Jul 07 '23 18:07 dataders

@naumovdalex I'm going to close this issue for now. If possible, I think this would make a great reproducible use case to rationalize https://github.com/aws/amazon-redshift-python-driver/issues/162. Perhaps open an issue there detailing the limitation?

Jul 13 '23 13:07 dataders

We noticed the same with our post-hook, what helped us to set autocommit: False in profiles.yml (more details here)

I tried this, unfortunately that didn't help as well. Also, as part of our team uses dbt Cloud IDE as well, this would be a bit more complicated for us

Jul 14 '23 15:07 naumovdalex

@naumovdalex sorry to hear your existing workflows is failing. As part of the migration to using the redshift_connector connector library in 1.5 instead of psycopg2, this adapter's approach to transactions have changed such that every query is executed at once.

Check out #463 for more context, and I'd love to see if you get the the intended behavior by doing as @jan-benisek says and adding autocommit: False to your profiles.yml.

I suspect that you may still encounter issues due to design decisions of the underlying connector library, for example see aws/amazon-redshift-python-driver#162.

@dataders it's not that the workflows fail, but some macros are taking more than 5 minutes to execute, instead of just a few seconds

Jul 14 '23 15:07 naumovdalex

I understand now -- that not an ideal experience. @naumovdalex can you please provide a reproducible example? This would help the engineer at Redshift work on the above linked issue on their connector library

Jul 25 '23 13:07 dataders

@naumovdalex are you still experiencing this issue in the new versions of dbt-redshift?

Sep 17 '24 11:09 amychen1776