datafusion-comet icon indicating copy to clipboard operation
datafusion-comet copied to clipboard

Fix inconsistencies in operator configs for enabling/disabling/allowing incompatible

Open andygrove opened this issue 1 month ago • 2 comments

What is the problem the feature request solves?

Expressions have consistent configs for enabling, disabling, and allowing incompatible uses. This is because CometExpressionSerde has the following method:

def getExprConfigName(expr: T): String = expr.getClass.getSimpleName

Config names are built from this:

def getExprEnabledConfigKey(name: String): String = {
  s"${CometConf.COMET_EXPR_CONFIG_PREFIX}.$name.enabled"
}
def getExprAllowIncompatConfigKey(name: String): String = {
  s"${CometConf.COMET_EXPR_CONFIG_PREFIX}.$name.allowIncompatible"
}

Unfortunately, operators followed a different pattern, where explicit configs were implemented:

val COMET_EXEC_GLOBAL_LIMIT_ENABLED: ConfigEntry[Boolean] =
  createExecEnabledConfig("globalLimit", defaultValue = true)
val COMET_EXEC_BROADCAST_HASH_JOIN_ENABLED: ConfigEntry[Boolean] =
  createExecEnabledConfig("broadcastHashJoin", defaultValue = true)

Once https://github.com/apache/datafusion-comet/pull/2741 is merged, there will be different configs for enabling/disabling versus allowing incompatible. For example:

spark.comet.exec.globalLimit.enabled=true
spark.comet.exec.GlobalLimitExec.allowIncompatible=true

We can't simply rename the "enable" configs because that would be a breaking change for users.

I propose that we update CometOperatorSerde to match the CometExpressionSerde approach, but also build in support for respecting the old / deprecated config as well.

Describe the potential solution

No response

Additional context

No response

andygrove avatar Nov 09 '25 16:11 andygrove

I propose adding these methods to CometOperatorSerde:


def getExecConfigName(expr: T): String = expr.getClass.getSimpleName.replace("Exec", "")

def getExecLegacyConfigName(expr: T): Option[String] = None

The current enabledConfig method can be removed:

def enabledConfig: Option[ConfigEntry[Boolean]]

Next, QueryPlanSerde.isOperatorEnabled can be updated as needed to use the new methods.

andygrove avatar Nov 09 '25 16:11 andygrove

GenerateDocs will also need updating

andygrove avatar Nov 09 '25 16:11 andygrove