Andy Grove

Results: 438 issues by Andy Grove

### What is the problem the feature request solves? DataFusion has an optimization where projections can be pushed down into hash joins. This is implemented in the projection pushdown optimizer rule...

enhancement
performance

### What is the problem the feature request solves? We can probably accelerate reading of JSON files by continuing to use JVM Spark to read bytes from disk but then...

enhancement
performance

### Describe the bug I was experimenting with enabling `spark.comet.parquet.enable.directBuffer` and this happened: ``` Caused by: org.apache.spark.SparkException: Encountered error while reading file file:///mnt/bigdata/tpcds/sf100/inventory.parquet/part1.parquet. Details: at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotReadFilesError(QueryExecutionErrors.scala:877) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:307) at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:125)...

bug

Here is a part of `format_and_mount.sh`:
```
echo "Creating 500mb join datasets"
Rscript ../_data/join-datagen.R 1e7 0 0
Rscript ../_data/join-datagen.R 1e7 5 0
Rscript ../_data/join-datagen.R 1e7 0 1
```
This fails:...

## Which issue does this PR close? N/A ## Rationale for this change DataFusion Comet is currently maintaining a fork of FilterExec with a small modification to change the way...

physical-expr
Stale

### Is your feature request related to a problem or challenge? When running TPC-H q1 in Spark + DataFusion Comet, the expression `l_extendedprice#21 * (1 - l_discount#22)` appears twice in...

enhancement

## Which issue does this PR close? Follows on from https://github.com/apache/datafusion-comet/pull/1744 Diff between this PR and https://github.com/apache/datafusion-comet/pull/1744: https://github.com/andygrove/datafusion-comet/compare/scan-refactor-2...scan-refactor-3 ## Rationale for this change This PR adds a `scanImpl` attribute to...

## Which issue does this PR close? N/A ## Rationale for this change Fixing technical debt in preparation for other improvements for native scans and complex type support ## What...

### What is the problem the feature request solves? When switching from the default profile to the spark-4.0 profile (and the jdk-17 profile), I ran into various issues with building...

enhancement
good first issue