elementary generating insert statements with more than 16,384 elements, exceeding max values allowed for Snowflake
Describe the bug
migrated to elementary 0.14 from 0.12
Successfully ran dbt run -s elementary
tried running a single model via dbt run -s some_model
got error:
001795 (42601): SQL compilation error: error line 2 at position 25
maximum number of expressions in a list exceeded, expected at most 16,384, got 17,898
Log of when it happened.
13:44:29 On master: /* {"app": "dbt", "dbt_version": "1.7.8", "profile_name": "REDACTED", "target_name": "ci", "connection_name": "master"} */
create or replace temporary table "_CI_PR_321"."ELEMENTARY"."DBT_COLUMNS__tmp_20240216134429758497"
as
(
SELECT
metadata_hash
FROM "_CI_PR_321"."ELEMENTARY"."DBT_COLUMNS"
WHERE 1 = 0
);
13:44:30 SQL status: SUCCESS 1 in 0.0 seconds
13:44:30 Using snowflake connection "master"
13:44:30 On master: /* {"app": "dbt", "dbt_version": "1.7.8", "profile_name": "REDACTED", "target_name": "ci", "connection_name": "master"} */
describe table "_CI_PR_321"."ELEMENTARY"."DBT_COLUMNS__tmp_20240216134429758497"
13:44:30 SQL status: SUCCESS 1 in 0.0 seconds
13:44:30 Elementary: Inserting 17898 rows to table "_CI_PR_321"."ELEMENTARY"."DBT_COLUMNS__tmp_20240216134429758497"
13:44:46 Elementary: [1/1] Running insert query.
13:44:47 Using snowflake connection "master"
13:44:47 On master: /* {"app": "dbt", "dbt_version": "1.7.8", "profile_name": "REDACTED", "target_name": "ci", "connection_name": "master"} */
insert into "_CI_PR_321"."ELEMENTARY"."DBT_COLUMNS__tmp_20240216134429758497"
(METADATA_HASH) values
('000252a5f1b9e1450a378b8ea3b891d1'),('000463c272cb34d9b52a3a18925d4e18'), ... <17,898 values>
13:44:48 Snowflake adapter: Snowflake query id: 01b263d8-0708-7b02-0000-44ed063f6a5a
13:44:48 Snowflake adapter: Snowflake error: 001795 (42601): SQL compilation error: error line 2 at position 25
maximum number of expressions in a list exceeded, expected at most 16,384, got 17,898
13:44:48 Database error while running on-run-end
13:44:48 On master: Close
13:44:48 Connection 'master' was properly closed.
13:44:48 Connection 'model.REDACTED.log_clean_stream_event' was properly closed.
13:44:48 Connection 'list__CI_PR_321_elementary' was properly closed.
13:44:48 Connection 'model.REDACTED.fct_device_activity_plus' was properly closed.
13:44:48 Connection 'model.REDACTED.int__fct_device_activity_plus' was properly closed.
13:44:48 Connection 'list__CI_PR_321_staging' was properly closed.
13:44:48 Connection 'list__CI_PR_321_intermediate' was properly closed.
13:44:48 Connection 'model.REDACTED.log_clean_site_event' was properly closed.
13:44:48
13:44:48 Finished running 3 view models, 1 incremental model, 1 hook in 0 hours 3 minutes and 39.69 seconds (219.69s).
13:44:48 Command end result
13:44:48
13:44:48 Completed with 1 error and 0 warnings:
13:44:48
13:44:48 on-run-end failed, error:
001795 (42601): SQL compilation error: error line 2 at position 25
maximum number of expressions in a list exceeded, expected at most 16,384, got 17,898
To Reproduce
Not sure how, as it seems to be specific to our Snowflake version.
Expected behavior
If the number of values for an insert statement exceeds 16,384, then the insert should be split into multiple statements, or the values combined with UNION ALL instead of a single VALUES list.
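For illustration, here is a minimal sketch of two shapes the generated SQL could take instead of one oversized VALUES list; the table name and hash values are placeholders, not elementary's actual output:

```sql
-- Sketch A: split the tuples into chunks of at most 16,384 and emit one
-- INSERT per chunk (two placeholder tuples stand in for each chunk here).
insert into dbt_columns__tmp (metadata_hash) values
  ('hash_00001'), ('hash_00002');  -- tuples 1 .. 16,384
insert into dbt_columns__tmp (metadata_hash) values
  ('hash_16385'), ('hash_16386');  -- tuples 16,385 .. 17,898

-- Sketch B: the UNION ALL shape suggested above; a chain of SELECTs is
-- not subject to the 16,384-expression limit on a VALUES list, though a
-- very long statement can still hit other statement-size limits.
insert into dbt_columns__tmp (metadata_hash)
select 'hash_00001' union all
select 'hash_00002' union all
select 'hash_17898';
```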
Environment (please complete the following information):
- edr Version: 0.14.0
- dbt package Version: 1.7.8
Hi @avishwakarma-tunein, I looked a bit into this issue with other people in our community, and this is what I know:
Context
This issue has been seen a few times recently - example.
It seems some versions of Snowflake cap the number of expressions in a single INSERT ... VALUES list at 16,384. We currently limit the size of our queries by default based on byte size, not row count, so we can hit this cap: the 17,898 hashes above come to only roughly 700 KB, small enough to pass a size check while still exceeding the row limit.
Workaround
Add the following vars to the dbt project to change this behavior:
vars:
  insert_rows_method: chunk
  dbt_artifacts_chunk_size: <number lower than 16,384>
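For context, insert_rows_method: chunk switches the batching from query byte size to row count, and dbt_artifacts_chunk_size caps the number of rows per INSERT statement; both vars go under the top-level vars: block of dbt_project.yml.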
Possible solutions
- Change the default behavior on Snowflake - might cause issues with queries that have few rows but a lot of data in each
- Limit queries on both size and row count - a little trickier to implement (see the sketch below)
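To make the second option concrete, here is a hypothetical Jinja sketch of a dual-limit chunker as a dbt macro; the macro name, arguments, and defaults are invented for illustration and are not elementary's actual code:

```sql
{#- Hypothetical sketch, not elementary's real macro: split pre-rendered
    value tuples into chunks bounded by BOTH a byte budget and a row cap. -#}
{% macro get_insert_chunks(value_tuples, max_query_size=1000000, max_rows=16384) %}
  {% set chunks = [] %}
  {% set ns = namespace(current=[], size=0) %}
  {% for tuple_sql in value_tuples %}
    {# close the current chunk when either limit would be exceeded #}
    {% if ns.current and (ns.current | length >= max_rows or ns.size + (tuple_sql | length) > max_query_size) %}
      {% do chunks.append(ns.current) %}
      {% set ns.current = [] %}
      {% set ns.size = 0 %}
    {% endif %}
    {% do ns.current.append(tuple_sql) %}
    {% set ns.size = ns.size + (tuple_sql | length) %}
  {% endfor %}
  {% if ns.current %}
    {% do chunks.append(ns.current) %}
  {% endif %}
  {{ return(chunks) }}
{% endmacro %}
```

Each returned chunk would then be rendered as its own INSERT ... VALUES statement, so a load like the 17,898 rows above would become two inserts that respect both limits.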
Hi @avishwakarma-tunein, I'm closing this issue since it's been open for over 3 months. If you feel this is still relevant and Ofek's suggestion didn't work for you, please feel free to re-open.