tpch
Why is the Spark memory set to 2gb?
Here's the line: https://github.com/pola-rs/tpch/blob/6c5bbe93a04cfcd25678dd860bab5ad61ad66edb/queries/pyspark/utils.py#L24
If these benchmarks run on a single node, we should probably set the shuffle partitions (`spark.sql.shuffle.partitions`) to something like 1-4 instead of the default of 200.
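Here is a rough sketch of what I have in mind, not a drop-in patch for `utils.py`. The helper names (`single_node_conf`, `build_session`), the app name, and the `8g` value are placeholders I made up for illustration. I am also assuming the 2gb setting maps to `spark.driver.memory`, which is the setting that matters in local mode because the executors run inside the driver JVM.

```python
def single_node_conf(shuffle_partitions=4, driver_memory="8g"):
    """Return Spark settings for a single-node (local mode) run.

    Both defaults are illustrative placeholders, not tuned values.
    """
    return {
        # Spark's default is 200, which mostly adds scheduling overhead
        # when all tasks run on one machine.
        "spark.sql.shuffle.partitions": str(shuffle_partitions),
        # Assumption: this is the setting behind the 2gb value.
        "spark.driver.memory": driver_memory,
    }


def build_session(conf):
    """Build a local SparkSession from a dict of settings."""
    # Imported here so the config helper works without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.master("local[*]").appName("tpch_single_node")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Usage would be something like `spark = build_session(single_node_conf(shuffle_partitions=4))`. Note that the driver memory has to be set before the JVM starts, so it only takes effect if no session is already running in the process.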