dbt-databricks
PARSE_SYNTAX_ERROR when running multiple queries with run_query()
Describe the bug
When trying to run multiple semicolon-separated queries against Databricks with the run_query() macro, I get a PARSE_SYNTAX_ERROR syntax error.
Steps To Reproduce
Below is a sample macro that reproduces this error for me.
{% macro my_macro() -%}
{% set sql_commands %}
select 1;
select 2;
{% endset %}
{{ run_query(sql_commands) }}
{%- endmacro %}
Expected behavior
My expectation is to be able to run a set of valid SQL statements separated by semicolons. There are many examples of dbt macro code online (e.g. the grant_select macro in the docs here) that show multiple SQL queries in a macro being run together. I'm able to do this with the example macro above (and others) in Snowflake, for example.
My specific use-case is to build a macro that generates schema grants from our dbt code (similar to the example grant_select macro linked to above). Currently I'm working around this by writing a custom macro, run_queries(), that takes a list of SQL statements and executes each one separately.
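For illustration, a minimal sketch of such a workaround macro might look like the following (the name run_queries and its list-based signature are my own assumptions; only run_query is a built-in dbt macro):
{% macro run_queries(sql_statements) -%}
    {# Workaround sketch: run each statement in its own run_query() call,
       so only one statement ever reaches Databricks at a time. #}
    {% for statement in sql_statements %}
        {% do run_query(statement) %}
    {% endfor %}
{%- endmacro %}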
Screenshots and log output
Here's the error output in the command line from the example macro run above.
22:06:14 Encountered an error while running operation: Runtime Error
Runtime Error
[PARSE_SYNTAX_ERROR] Syntax error at or near 'select': extra input 'select'.(line 2, pos 4)
== SQL ==
select 1;
select 2
----^^^
/* {"app": "dbt", "dbt_version": "1.7.11", "dbt_databricks_version": "1.7.13", "databricks_sql_connector_version": "2.9.5", "profile_name": "fleet_databricks", "target_name": "dev", "connection_name": "macro_my_macro"} */
System information
The output of dbt --version:
Core:
- installed: 1.7.11
- latest: 1.7.11 - Up to date!
Plugins:
- databricks: 1.7.13 - Up to date!
- spark: 1.7.1 - Up to date!
The operating system you're using: macOS Sonoma Version 14.4.1
The output of python --version:
Python 3.9.4
Additional context
I'm wondering if this is potentially related to the SparkSQL implementation in Databricks. I learned from our Databricks reps that each individual SQL command run in the Databricks SQL UI is executed against a separate SparkSQL environment.
As you suspect, this is currently a limitation of the platform. I've raised the issue internally, but can't commit to a timeline yet. Thanks for your report.
I guess in the meantime, one should do something like this instead:
{% macro my_macro() -%}
{% set sql_commands = ['select 1', 'select 2'] %}
{% for command in sql_commands %}
{% do run_query(command) %}
{% endfor %}
{%- endmacro %}
@jeremyyeo yes, pretty much. I haven't written a macro for it, but that's basically what I do internally in places where dbt expects us to be able to execute multiple statements.
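For reference, a rough sketch of that approach as a dbt macro might look like this (the macro name run_statements and the naive split-on-semicolon logic are assumptions, not the adapter's actual internal implementation):
{% macro run_statements(sql) -%}
    {# Assumed helper: naively split a multi-statement string on ';' and
       execute each non-empty statement with its own run_query() call.
       Note that a simple split like this would break on semicolons inside string literals. #}
    {% for statement in sql.split(';') %}
        {% if statement | trim %}
            {% do run_query(statement) %}
        {% endif %}
    {% endfor %}
{%- endmacro %}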