Andy Grove
### What is the problem the feature request solves? DataFusion has an optimization where projections can be pushed down into hash joins. This is implemented in the projection pushdown optimizer rule...
### What is the problem the feature request solves? We can probably accelerate reading of JSON files by continuing to use JVM Spark to read bytes from disk but then...
### Describe the bug I was experimenting with enabling `spark.comet.parquet.enable.directBuffer` and this happened: ``` Caused by: org.apache.spark.SparkException: Encountered error while reading file file:///mnt/bigdata/tpcds/sf100/inventory.parquet/part1.parquet. Details: at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotReadFilesError(QueryExecutionErrors.scala:877) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:307) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:125)...
Here is a part of `format_and_mount.sh`:
```
echo "Creating 500mb join datasets"
Rscript ../_data/join-datagen.R 1e7 0 0
Rscript ../_data/join-datagen.R 1e7 5 0
Rscript ../_data/join-datagen.R 1e7 0 1
```
This fails:...
## Which issue does this PR close? N/A ## Rationale for this change DataFusion Comet is currently maintaining a fork of FilterExec with a small modification to change the way...
### Is your feature request related to a problem or challenge? When running TPC-H q1 in Spark + DataFusion Comet, the expression `l_extendedprice#21 * (1 - l_discount#22)` appears twice in...
## Which issue does this PR close? Follows on from https://github.com/apache/datafusion-comet/pull/1744 Diff between this PR and https://github.com/apache/datafusion-comet/pull/1744: https://github.com/andygrove/datafusion-comet/compare/scan-refactor-2...scan-refactor-3 ## Rationale for this change This PR adds a `scanImpl` attribute to...
## Which issue does this PR close? N/A ## Rationale for this change Fixing technical debt in preparation for other improvements for native scans and complex type support ## What...
### What is the problem the feature request solves? When switching from the default profile to the spark-4.0 profile (and the jdk-17 profile), I ran into various issues with building...