Andy Grove comments

Results: 657 comments of Andy Grove

Investigate CoalescedHashPartitioning

> See [apache/spark@81639090622](https://github.com/apache/spark/commit/81639090622) for changes that were needed to the CPU BroadcastHashJoinExec that are probably relevant to the changes likely needed for the GPU version. This commit updated the `outputPartitioning`...

fix: Fall back to Spark for MakeDecimal with unsupported input type

> Are we waiting to address any feedback on this PR? I think I addressed all of the feedback from @martin-g.

fix: Fall back to Spark for MakeDecimal with unsupported input type

> Thanks @andygrove and @martin-g for the review.
>
> I feel this PR is good as it has the consistent issue before the PR and after PR it is...

feat: Implement shared memory pool for case where spark.memory.offHeap.enabled=false

> I'm a bit worried about this approach because we are implementing greedy mode inside `CometTaskMemoryManager`, which is known to starve consumers frequently. I prefer using fair spill pool for...

feat: Implement shared memory pool for case where spark.memory.offHeap.enabled=false

Closing in favor of https://github.com/apache/datafusion-comet/pull/1021

Fall back to Spark if query uses DPP to avoid perf regressions in TPC-DS

This is resolved for v1 data sources but not for v2.

Possible native shuffle optimization

~I have been learning more about Spark shuffle and now understand why this issue does not make sense.~ edit: I thought I understood this, but now I am not so...

Possible native shuffle optimization

Useful reference info: https://medium.com/@philipp.brunenberg/understanding-apache-spark-shuffle-85644d90c8c6

Possible native shuffle optimization

It seems that ShuffleWriterExec is invoked by ShuffleMapTask, which handles reading the input RDD data, so we cannot easily override this mechanism.

Improve performance of Spark-compatible decimal aggregates

Related upstream changes in arrow-rs: https://github.com/apache/arrow-rs/pull/6419