Wenchen Fan
Can we split this PR into two? IIUC the DS v1 change can benefit file source tables immediately if `spark.sql.statistics.size.autoUpdate.enabled` is enabled. For the DS v2 part, do we support...
thanks, merging to master!
@EnricoMi sorry, we lost track of this PR. Have you addressed all the review comments?
Shall we revert this if https://github.com/apache/spark/pull/44976#discussion_r1630428579 is a real issue? I don't think this is a critical path for performance (how much parallelism do you expect for function lookups in...
I've sent out the revert PR: https://github.com/apache/spark/pull/46940
cc @scottsand-db
@felipepessoto thanks for providing the repro! What was the error you hit? And can you also post the result of `spark.sessionState.executePlan(plan).analyzed.treeString`?
One workaround is to set `spark.sql.legacy.useV1Command` to true. Ideally, `DeltaCatalog` should not return views in `listTables`.
If the Spark schema doesn't match the specified Avro schema, what shall we do? And shall we allow compatible schema changes such as int to long?
As a start, I think we can simply require the Spark schema to be the same as the Avro schema, while accepting differences in namespace and field names.
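To make the proposed rule concrete, here is a rough sketch of a "same schema, names ignored" check. It uses a hypothetical toy schema model (`Record`, `Field`, and `same_shape` are made up for illustration and are not Spark or Avro APIs), matches fields by position, and deliberately rejects int to long since that is not part of the initial proposal.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Field:
    # Hypothetical field: a name plus a type, where the type is either
    # a primitive name such as "int" or a nested Record.
    name: str
    type: object


@dataclass(frozen=True)
class Record:
    # Hypothetical record: the full name (including namespace) and its fields.
    name: str
    fields: Tuple[Field, ...]


def same_shape(a: object, b: object) -> bool:
    """Return True if both schemas have identical types field by field,
    ignoring record names, namespaces, and field names."""
    if isinstance(a, Record) and isinstance(b, Record):
        if len(a.fields) != len(b.fields):
            return False
        # Fields are paired by position, so their names never get compared.
        return all(same_shape(fa.type, fb.type)
                   for fa, fb in zip(a.fields, b.fields))
    if isinstance(a, Record) or isinstance(b, Record):
        return False  # a record never matches a primitive
    return a == b  # primitives must match exactly; no int -> long promotion


spark_side = Record("spark_schema", (Field("id", "int"), Field("name", "string")))
avro_side = Record("com.example.User",
                   (Field("user_id", "int"), Field("user_name", "string")))
widened = Record("com.example.User",
                 (Field("user_id", "long"), Field("user_name", "string")))

names_only_differ = same_shape(spark_side, avro_side)
int_vs_long = same_shape(spark_side, widened)
```

With these inputs, `names_only_differ` is true and `int_vs_long` is false. Allowing compatible changes later would mean relaxing only the final primitive comparison.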