Fix inconsistencies in operator configs for enabling/disabling/allowing incompatible
What is the problem the feature request solves?
Expressions have consistent configs for enabling, disabling, and allowing incompatible uses. This is because CometExpressionSerde has the following method:
def getExprConfigName(expr: T): String = expr.getClass.getSimpleName
Config names are built from this:
def getExprEnabledConfigKey(name: String): String = {
s"${CometConf.COMET_EXPR_CONFIG_PREFIX}.$name.enabled"
}
def getExprAllowIncompatConfigKey(name: String): String = {
s"${CometConf.COMET_EXPR_CONFIG_PREFIX}.$name.allowIncompatible"
}
Unfortunately, operators followed a different pattern, where explicit configs were implemented:
val COMET_EXEC_GLOBAL_LIMIT_ENABLED: ConfigEntry[Boolean] =
createExecEnabledConfig("globalLimit", defaultValue = true)
val COMET_EXEC_BROADCAST_HASH_JOIN_ENABLED: ConfigEntry[Boolean] =
createExecEnabledConfig("broadcastHashJoin", defaultValue = true)
Once https://github.com/apache/datafusion-comet/pull/2741 is merged, there will be different configs for enabling/disabling versus allowing incompatible. For example:
spark.comet.exec.globalLimit.enabled=true
spark.comet.exec.GlobalLimitExec.allowIncompatible=true
We can't simply rename the "enable" configs because that would be a breaking change for users.
I propose that we update CometOperatorSerde to match the CometExpressionSerde approach, but also build in support for respecting the old / deprecated config as well.
Describe the potential solution
No response
Additional context
No response
I propose adding these methods to CometOperatorSerde:
def getExecConfigName(expr: T): String = expr.getClass.getSimpleName.replace("Exec", "")
def getExecLegacyConfigName(expr: T): Option[String] = None
The current enabledConfig method can be removed:
def enabledConfig: Option[ConfigEntry[Boolean]]
Next, QueryPlanSerde.isOperatorEnabled can be updated as needed to use the new methods.
GenerateDocs will also need updating