WIP: Fix performance regression with `stddev` being enabled by default

Open andygrove opened this issue 6 months ago • 1 comments

Which issue does this PR close?

Closes #.

Rationale for this change

Fix a performance regression and simplify the configs for enabling operators and expressions.

tpcds_allqueries

We now fall back to Spark for stddev_sample aggregates.

What changes are included in this PR?

  • stddev is now disabled by default. A recent config-related regression had enabled it by default, and this aggregate is much slower in DataFusion than in Spark.
  • Remove spark.comet.exec.all.enabled and enable all operators by default. Each operator can be disabled individually by changing its enabled config to false. These configs are all documented.
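
As a rough illustration of the per-operator configs described above, here is a minimal sketch that builds the Spark settings needed to turn individual operators off or to opt back in to stddev. The key pattern `spark.comet.exec.<operator>.enabled` and the key `spark.comet.expression.stddev.enabled` are assumptions made for this example, as are the helper names `comet_overrides` and `to_spark_submit_args`; check the Comet configuration docs for the exact keys.

```python
# Hypothetical helpers: the config key names below are assumptions,
# not confirmed by this PR description.

def comet_overrides(disabled_operators=(), enable_stddev=False):
    """Return Spark conf overrides as a dict of key -> "true"/"false"."""
    conf = {}
    # Operators are on by default, so only the ones being disabled need a setting.
    for op in disabled_operators:
        conf[f"spark.comet.exec.{op}.enabled"] = "false"
    # stddev is off by default, so enabling it is an explicit opt-in.
    if enable_stddev:
        conf["spark.comet.expression.stddev.enabled"] = "true"
    return conf


def to_spark_submit_args(conf):
    """Render the overrides as `--conf key=value` arguments, sorted by key."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    overrides = comet_overrides(disabled_operators=["sort"], enable_stddev=True)
    print(" ".join(to_spark_submit_args(overrides)))
```

With no arguments, `comet_overrides()` returns an empty dict, which matches the new behavior: nothing needs to be set to get all operators enabled and stddev falling back to Spark.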

How are these changes tested?

Existing tests

andygrove, Aug 18 '24 13:08