
Snowflake CSV - use parallel scan settings for uncompressed CSV

Open • sfc-gh-xhuang opened this issue 8 months ago • 1 comment

I noticed that for DuckDB, the CSV file is decompressed before loading: https://github.com/ClickHouse/ClickBench/blob/main/duckdb/benchmark.sh#L18C1-L18C5

For Snowflake, we also support faster parallel scanning of uncompressed CSVs.

Can we modify the Snowflake data loading test so that it loads an uncompressed CSV with MULTI_LINE=FALSE and COMPRESSION=NONE?

https://medium.com/snowflake/recap-of-snowflake-ingestion-cost-and-performance-improvements-large-csv-demo-911e6588d626?source=friends_link&sk=38a754b71aa06f51f269c6974b24abd8
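
As a rough illustration of the requested change, the sketch below builds a `COPY INTO` statement carrying the two file format options named above. The table name `hits`, the stage path `@clickbench_stage/hits.csv`, and the helper `build_copy_statement` are hypothetical placeholders, not taken from the ClickBench Snowflake scripts; the actual load script may need further options (delimiters, quoting, and so on).

```python
def build_copy_statement(table: str, stage_path: str) -> str:
    """Build a COPY INTO statement for an uncompressed CSV.

    The table and stage path are supplied by the caller; the values used
    in the example call below are hypothetical, not ClickBench's own.
    """
    # File format options requested in the issue: no compression, and
    # records that never span multiple lines.
    file_format_options = {
        "TYPE": "CSV",
        "COMPRESSION": "NONE",
        "MULTI_LINE": "FALSE",
    }
    options = " ".join(f"{key} = {value}" for key, value in file_format_options.items())
    return f"COPY INTO {table} FROM {stage_path} FILE_FORMAT = ({options})"


# Hypothetical table and stage names, for illustration only.
print(build_copy_statement("hits", "@clickbench_stage/hits.csv"))
```

The generated statement would still need to be run against a stage that holds the decompressed file, which implies adding a decompression step before the upload, as the DuckDB script does.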

sfc-gh-xhuang, Jun 17 '25 21:06

Yes, please send a PR, ClickBench is a community project.

rschu1ze, Jun 17 '25 21:06