[complex types] Improve "unsupported schema" message
Describe the bug
I tried testing complex type support from the Spark shell by setting COMET_PARQUET_SCAN_IMPL=native_datafusion, but I could not read a Parquet file containing structs and arrays.
The error message only says "Unsupported schema", which isn't a great user experience. It would be better to show the specific reason why the schema isn't supported (in this case, I had not enabled COMET_SCAN_ALLOW_INCOMPATIBLE).
scala> spark.sql("select * from t1").show
25/04/05 09:36:50 WARN CometSparkSessionExtensions$CometExecRule: Comet cannot execute some parts of this plan natively (set spark.comet.explainFallback.enabled=false to disable this logging):
CollectLimit [COMET: CollectLimit is not supported]
+- Project [COMET: Project is not native because the following children are not native (Scan parquet )]
+- Scan parquet [COMET: Unsupported schema StructType(StructField(c0,BooleanType,true),StructField(c1,ByteType,true),...
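One possible shape for a more specific message, as a minimal sketch only: it assumes the support check is a recursive match over the data type, and it uses simplified stand-in types (DType, ByteT, StructT, and so on) plus a hypothetical unsupportedReason helper rather than Spark's DataType classes or any existing Comet function. Instead of returning a Boolean, the check returns the path of the first offending field and the setting that would allow it.

```scala
// Hypothetical, simplified stand-ins for Spark's DataType hierarchy.
sealed trait DType
case object BoolT extends DType
case object ByteT extends DType
case object ShortT extends DType
case object IntT extends DType
final case class ArrayT(element: DType) extends DType
final case class StructT(fields: List[(String, DType)]) extends DType

object SchemaCheck {
  // Returns None when the type is supported, or Some(reason) naming the
  // first offending field path and the setting that would permit it.
  def unsupportedReason(
      dt: DType,
      allowIncompatible: Boolean,
      path: String = "<root>"): Option[String] = dt match {
    case ByteT | ShortT if !allowIncompatible =>
      Some(s"$path: $dt is only supported when COMET_SCAN_ALLOW_INCOMPATIBLE is enabled")
    case ArrayT(element) =>
      // Recurse into the element type, extending the reported path.
      unsupportedReason(element, allowIncompatible, s"$path.element")
    case StructT(fields) =>
      // Check fields lazily and stop at the first one that is unsupported.
      fields.iterator
        .map { case (name, t) => unsupportedReason(t, allowIncompatible, s"$path.$name") }
        .collectFirst { case Some(reason) => reason }
    case _ => None
  }
}
```

With this shape, the fallback log could print the returned reason next to the scan node instead of the full StructType dump.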
Hi @andygrove, do you have a reproducible case, or at least a schema? The native_datafusion reader should handle different kinds of array/struct combinations.
This is the reason I was getting the error:
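private def isGloballySupported(dt: DataType): Boolean = dt match {
  // Byte and short are rejected for the DataFusion Parquet scan unless
  // COMET_SCAN_ALLOW_INCOMPATIBLE is enabled.
  case ByteType | ShortType
      if CometSparkSessionExtensions.usingDataFusionParquetExec(SQLConf.get) &&
        !CometConf.COMET_SCAN_ALLOW_INCOMPATIBLE.get() =>
    false
  // ... (excerpt ends here; remaining cases not shown)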
Creating a struct or array containing byte or short should reproduce the issue.
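A minimal reproduction sketch along those lines, untested against Comet itself. It assumes a spark-shell session started with Comet enabled, the native_datafusion scan selected, and COMET_SCAN_ALLOW_INCOMPATIBLE left disabled, as in the report. The output path and column names are hypothetical; only the table name t1 comes from the original query.

```scala
// Paste into spark-shell (use :paste), where `spark` is the predefined SparkSession.
val path = "/tmp/comet_byte_short_repro" // hypothetical scratch location

// Write a Parquet file whose struct and array columns contain byte and short values.
spark.range(3)
  .selectExpr(
    "struct(cast(id as tinyint) as b, cast(id as smallint) as s) as st",
    "array(cast(id as tinyint)) as arr")
  .write.mode("overwrite").parquet(path)

// Read it back through the same query used in the report.
spark.read.parquet(path).createOrReplaceTempView("t1")
spark.sql("select * from t1").show()
```

If the diagnosis above is right, the scan should fall back to Spark with the "Unsupported schema" message when the setting is disabled, and run natively once it is enabled.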
Fixed in https://github.com/apache/datafusion-comet/pull/1667