datafusion-comet icon indicating copy to clipboard operation
datafusion-comet copied to clipboard

perf: Add option to replace SortMergeJoin with ShuffledHashJoin

Open andygrove opened this issue 1 year ago • 7 comments

Which issue does this PR close?

Closes https://github.com/apache/datafusion-comet/issues/1006

Rationale for this change

Improved performance

What changes are included in this PR?

  • Add new config option to replace SMJ with SHJ

How are these changes tested?

I manually ran TPC-H and saw improved performance. I will post benchmarks once I have run more tests.

andygrove avatar Oct 09 '24 17:10 andygrove

Here is a teaser for the performance improvement. This is for TPC-H q11 (SF=100) with broadcast joins disabled (I am looking into a regression with those). I ran the query 5 times each with rule enabled vs disabled.

Rule Off

79.87537693977356,
77.76734256744385,
75.35734295845032,
75.44863200187683,
72.88174152374268

Rule On

39.33945274353027,
36.159271240234375,
35.83299708366394,
35.638232707977295,
35.67777371406555

andygrove avatar Oct 09 '24 18:10 andygrove

There is a small danger in enabling this without having a good estimate of the size of the build side. ShuffleHashJoin has limits on how much data it can process efficiently. If the build side hash table has no spilling then a large enough build side will cause OOMs and if there is spilling, then SMJ can frequently lead to better performance. We might even see this when we scale the benchmark from SF1 to say SF10. Is there a way for us to get cardinality and row size for the build side somehow? Still worth adding this option though.

parthchandra avatar Oct 09 '24 20:10 parthchandra

if there is spilling, then SMJ can frequently lead to better performance I have seen this happen with Spark with some TPC-DS queries at SF10.

parthchandra avatar Oct 09 '24 20:10 parthchandra

Current benchmarks:

tpch_allqueries

Speedup of using HashJoin instead of SortMergeJoin:

tpch_queries_speedup

andygrove avatar Oct 09 '24 21:10 andygrove

Codecov Report

:x: Patch coverage is 19.44444% with 29 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 34.27%. Comparing base (e3ac6cf) to head (1073517). :warning: Report is 748 commits behind head on main.

Files with missing lines Patch % Lines
...ain/scala/org/apache/comet/rules/RewriteJoin.scala 0.00% 26 Missing :warning:
...org/apache/comet/CometSparkSessionExtensions.scala 40.00% 2 Missing and 1 partial :warning:
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1007      +/-   ##
============================================
- Coverage     34.41%   34.27%   -0.14%     
+ Complexity      886      881       -5     
============================================
  Files           112      113       +1     
  Lines         43479    43514      +35     
  Branches       9656     9663       +7     
============================================
- Hits          14962    14914      -48     
- Misses        25442    25510      +68     
- Partials       3075     3090      +15     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov-commenter avatar Oct 09 '24 23:10 codecov-commenter

I will add documentation to this PR today, explaining pros/cons of this feature in our tuning guide.

andygrove avatar Oct 10 '24 15:10 andygrove

@viirya @parthchandra This is now ready for review. The new option is disabled by default and I added a section to the tuning guide explaining why users may want to enable this new option.

andygrove avatar Oct 10 '24 16:10 andygrove

I have run into a deadlock when running TPC-DS benchmarks with this feature, so I am moving to draft while I investigate. It is possibly related to the memory pool issues that we are also working on in other PRs.

andygrove avatar Oct 11 '24 01:10 andygrove

After upmerging, I no longer see the deadlock, but instead get an error if I have insufficient memory allocated, which is an improvement.

org.apache.comet.CometNativeException (External error: Internal error: 

Partition is still not able to allocate enough memory for the array builders after spilling..

However, when I increase memory, I see queries fail due to https://github.com/apache/datafusion-comet/issues/1019.

andygrove avatar Oct 15 '24 14:10 andygrove

I have now marked the feature as experimental and explained in the tuning guide that there is no spill to disk so this could result in OOM.

andygrove avatar Oct 18 '24 20:10 andygrove

Fresh benchmarks after upmerging.

tpch_allqueries

andygrove avatar Oct 19 '24 18:10 andygrove

TPC-DS excluding q97 (OOM with ShuffledHashJoin).

tpcds_allqueries

andygrove avatar Oct 20 '24 00:10 andygrove