Andy Grove issues

Results: 317 issues by Andy Grove

Push projections into hash joins

### What is the problem the feature request solves? DataFusion has an optimization where projections can be pushed down into hash joins. This is implemented in the projection pushdown optimizer rule...

enhancement

performance

Implement native parsing of JSON files

### What is the problem the feature request solves? We can probably accelerate reading of JSON files by continuing to use JVM Spark to read bytes from disk but then...

enhancement

performance

Null pointer when spark.comet.parquet.enable.directBuffer is enabled

### Describe the bug I was experimenting with enabling `spark.comet.parquet.enable.directBuffer` and this happened:

```
Caused by: org.apache.spark.SparkException: Encountered error while reading file file:///mnt/bigdata/tpcds/sf100/inventory.parquet/part1.parquet. Details:
    at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotReadFilesError(QueryExecutionErrors.scala:877)
    at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:307)
    at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:125)...
```

bug

Stale

Implement physical optimizer rule for common subexpression elimination

### Is your feature request related to a problem or challenge? When running TPC-H q1 in Spark + DataFusion Comet, the expression `l_extendedprice#21 * (1 - l_discount#22)` appears twice in...

enhancement

Andy Grove

Push projections into hash joins

Implement native parsing of JSON files

Null pointer when spark.comet.parquet.enable.directBuffer is enabled

Upgrade DataFusion from 31.0.0 to 40.1.0

`join-datagen.R` expects 4 arguments but only 3 are provided in `format_and_mount` script

Add option to FilterExec to prevent re-using input batches

Implement physical optimizer rule for common subexpression elimination