Andy Grove
I created an epic in Comet to track donating the remaining Spark expressions: https://github.com/apache/datafusion-comet/issues/2084
I just re-tested this, and it is still an issue even after switching to the new engine. ``` scala> spark.read.json("no-body.json").show 24/01/17 00:02:09 WARN GpuOverrides: !Exec cannot run on GPU because...
This only seems to be an issue for a JSON file that only contains empty entries. If there is at least one non-empty row, then we match Spark. ``` $...
I'm testing this PR out now, in conjunction with some other PRs because I currently have a reproducible deadlock caused by memory pool issues, as far as I can tell.
Can we fall back to Spark (or another reader) for now if any of these configs are set to values that we do not yet honor?
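A minimal sketch of the kind of check suggested here, not Comet's actual implementation. The config keys, the honored values, and the function names are all hypothetical and chosen only for illustration: the idea is to compare each setting against the values the native reader is known to honor, and fall back if any setting is outside that set.

```python
# Hypothetical config keys, each mapped to the only values the native
# reader is assumed to honor. These are NOT real Spark or Comet keys.
HONORED_VALUES = {
    "example.reader.caseSensitive": {"false"},
    "example.reader.ignoreCorruptFiles": {"false"},
}


def fallback_reasons(conf):
    """Return one message per config set to a value outside the honored set."""
    reasons = []
    for key, honored in HONORED_VALUES.items():
        value = conf.get(key)
        # An unset key means the default applies, which is assumed to be honored.
        if value is not None and value.lower() not in honored:
            reasons.append(f"{key}={value} is not supported natively")
    return reasons


def choose_reader(conf):
    """Pick the native reader only when no unsupported setting is present."""
    return "fallback" if fallback_reasons(conf) else "native"
```

Returning the list of reasons, rather than a bare boolean, lets the caller log why the fallback happened.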
Thanks for writing this up @SemyonSinchenko. Your reasoning seems sound to me, and I agree that it would be quite a unique and powerful feature for Comet. I am not...
> It seems to me that re-using how spark handle python UDFs would be easier than implementing it from scratch using datafusion. But I'm not 100% sure. Yes, that is...
@SemyonSinchenko I started hacking on a solution for this in https://github.com/andygrove/datafusion-comet/tree/pyarrow-wip by adding a CometArrowPythonRunner. I plan on working on this as a low-priority task when I have time, but...
Thanks @lewiszlw. Could you rebase/upmerge (to fix the CI failure)? Then I can review.
These Spark SQL test failures with `native_iceberg_compat` are possibly related to this issue: - Spark native readers should respect spark.sql.caseSensitive - parquet *** FAILED *** (440 milliseconds) - SPARK-31116: Select...