
[complex types] Improve "unsupported schema" message

Open andygrove opened this issue 8 months ago • 2 comments

Describe the bug

I tried testing complex type support from the Spark shell by setting COMET_PARQUET_SCAN_IMPL=native_datafusion, but I could not read a Parquet file containing structs and arrays.

The error message only says "Unsupported schema", which isn't a great user experience. It would be better to show the specific reason why the schema isn't supported (in this case, I had not enabled COMET_SCAN_ALLOW_INCOMPATIBLE).

scala> spark.sql("select * from t1").show
25/04/05 09:36:50 WARN CometSparkSessionExtensions$CometExecRule: Comet cannot execute some parts of this plan natively (set spark.comet.explainFallback.enabled=false to disable this logging):
 CollectLimit [COMET: CollectLimit is not supported]
+-  Project [COMET: Project is not native because the following children are not native (Scan parquet )]
   +-  Scan parquet  [COMET: Unsupported schema StructType(StructField(c0,BooleanType,true),StructField(c1,ByteType,true),...
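
One way to surface a specific reason is to have the schema check return an explanation instead of a Boolean. The following is only a minimal sketch under stated assumptions: `unsupportedReason` and its `allowIncompatible` parameter are hypothetical names, not Comet's actual API, and the only rule modeled is the byte/short restriction discussed later in this thread. It uses Spark's public `org.apache.spark.sql.types` classes and can be pasted into a Spark shell.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper (not Comet's actual code): returns None when the type is
// accepted, or Some(reason) naming the first offending field.
def unsupportedReason(
    dt: DataType,
    allowIncompatible: Boolean,
    path: String = ""): Option[String] = dt match {
  case ByteType | ShortType if !allowIncompatible =>
    Some(s"field '$path' has type ${dt.typeName}, which needs the allow-incompatible scan setting")
  case s: StructType =>
    // Walk the struct fields lazily and stop at the first one with a reason.
    s.fields.iterator
      .map { f =>
        val child = if (path.isEmpty) f.name else s"$path.${f.name}"
        unsupportedReason(f.dataType, allowIncompatible, child)
      }
      .collectFirst { case Some(reason) => reason }
  case a: ArrayType =>
    unsupportedReason(a.elementType, allowIncompatible, s"$path[]")
  case m: MapType =>
    unsupportedReason(m.keyType, allowIncompatible, s"$path.key")
      .orElse(unsupportedReason(m.valueType, allowIncompatible, s"$path.value"))
  case _ => None
}

// Usage: a schema with a byte field nested inside a struct.
val schema = StructType(Seq(
  StructField("c0", BooleanType),
  StructField("st", StructType(Seq(StructField("b", ByteType))))
))
println(unsupportedReason(schema, allowIncompatible = false).getOrElse("schema is supported"))
```

Returning `Option[String]` keeps the supported case cheap while letting the fallback message name the exact nested field, rather than printing the whole `StructType`.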

Steps to reproduce

No response

Expected behavior

No response

Additional context

No response

andygrove avatar Apr 05 '25 16:04 andygrove

Hi @andygrove, do you have a repro case, or at least a schema? The native_datafusion reader should handle different kinds of array/struct combinations.

comphead avatar Apr 08 '25 22:04 comphead

Hi @andygrove, do you have a repro case, or at least a schema? The native_datafusion reader should handle different kinds of array/struct combinations.

This is the reason I was getting the error:

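  // Excerpt from the schema check: byte and short columns are rejected when the
  // DataFusion Parquet scan is in use and COMET_SCAN_ALLOW_INCOMPATIBLE is not enabled.
  private def isGloballySupported(dt: DataType): Boolean = dt match {
    case ByteType | ShortType
        if CometSparkSessionExtensions.usingDataFusionParquetExec(SQLConf.get) &&
          !CometConf.COMET_SCAN_ALLOW_INCOMPATIBLE.get() =>
      false
    // ... remaining cases omitted from this excerpt
  }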

Creating a struct or array containing byte or short should reproduce the issue.
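
A repro along those lines might look like the sketch below, meant for a Spark shell started with Comet enabled. Several things here are assumptions rather than details given in this thread: the config keys `spark.comet.scan.impl` and `spark.comet.scan.allowIncompatible` (assumed to correspond to COMET_PARQUET_SCAN_IMPL and COMET_SCAN_ALLOW_INCOMPATIBLE), that they can be set at runtime, and the output path. The DataFrame calls are standard Spark APIs.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell this returns the existing session.
val spark = SparkSession.builder().getOrCreate()

// Assumed config keys; check the Comet configuration docs for your version.
spark.conf.set("spark.comet.scan.impl", "native_datafusion")
spark.conf.set("spark.comet.scan.allowIncompatible", "false")

// Hypothetical scratch location for the test file.
val path = "/tmp/comet_byte_short_repro"

// Write a struct and an array that contain byte (tinyint) and short (smallint) values.
spark.range(3)
  .selectExpr(
    "named_struct('b', cast(id as tinyint), 's', cast(id as smallint)) as st",
    "array(cast(id as tinyint)) as arr")
  .write.mode("overwrite").parquet(path)

// Read it back the same way as in the original report.
spark.read.parquet(path).createOrReplaceTempView("t1")
spark.sql("select * from t1").show()
```

On a build with the check quoted above, the scan would be expected to fall back to Spark with the "Unsupported schema" warning; setting the allow-incompatible option to true should let the native scan proceed.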

andygrove avatar Apr 08 '25 22:04 andygrove

Fixed in https://github.com/apache/datafusion-comet/pull/1667

andygrove avatar Jun 16 '25 18:06 andygrove