Add dynamic filter (bounds) pushdown to HashJoinExec

Open adriangb opened this issue 6 months ago • 4 comments

Part of #7955.

My goal here is to lay the groundwork for pushing down dynamic filters from joins. I am only implementing bounds pushdown for now because I am sure it is cheap and it will probably be quite effective in many cases. It should also be fairly easy to push down a reference to the whole hash table in a follow-up PR.

Another follow-up is to enable parent filter pushdown through HashJoinExec. As with FilterExec, this requires adjusting parent filters for the join's projection, but we also need to check which columns each filter refers to so we can push it into the correct child (or not push it at all if it refers to columns from both children and can't be split per child).
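To make the column check concrete, here is a minimal Rust sketch of routing a parent filter to one child. It is not DataFusion's implementation: `PushTarget`, `route_filter` and `to_right_child_index` are hypothetical names, and it assumes the join output lists all left columns first, followed by the right columns. It also ignores the join type, which affects whether pushing a filter below the join is valid at all.

```rust
use std::collections::BTreeSet;

/// Where a parent filter may be pushed, judged only by the columns it references.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
pub enum PushTarget {
    Left,
    Right,
    /// References columns from both children (or no columns): keep it above the join.
    Neither,
}

/// `filter_columns` holds indices into the join's unprojected output schema,
/// where the first `left_width` columns are assumed to come from the left child.
pub fn route_filter(filter_columns: &BTreeSet<usize>, left_width: usize) -> PushTarget {
    if filter_columns.is_empty() {
        return PushTarget::Neither;
    }
    let all_left = filter_columns.iter().all(|&c| c < left_width);
    let all_right = filter_columns.iter().all(|&c| c >= left_width);
    match (all_left, all_right) {
        (true, false) => PushTarget::Left,
        (false, true) => PushTarget::Right,
        _ => PushTarget::Neither,
    }
}

/// Re-express a join-output column index relative to the right child's own schema.
/// Returns `None` if the index belongs to the left child.
pub fn to_right_child_index(join_index: usize, left_width: usize) -> Option<usize> {
    join_index.checked_sub(left_width)
}
```

With a left child of width 2, a filter on columns {0, 1} goes left, {2, 3} goes right, and {1, 2} stays above the join.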

adriangb avatar Jun 18 '25 15:06 adriangb

I think we should also consider a heuristic for not evaluating the filter if it's not useful.

Also, I think doing only the lookup is preferable to also computing / checking the bounds; I think the latter might create more overhead.

Dandandan avatar Jun 18 '25 15:06 Dandandan

Sorry, misclicked a button.

Dandandan avatar Jun 18 '25 15:06 Dandandan

I think doing only the lookup is preferable to also computing / checking the bounds; I think the latter might create more overhead

My thought was that for some cases the bounds checks are going to be quite effective at pruning and they should always be cheap to compute and cheap to apply. I'm surprised you say that they might create a lot of overhead?

adriangb avatar Jun 18 '25 15:06 adriangb

I think doing only the lookup is preferable to also computing / checking the bounds; I think the latter might create more overhead

My thought was that for some cases the bounds checks are going to be quite effective at pruning and they should always be cheap to compute and cheap to apply. I'm surprised you say that they might create a lot of overhead?

Maybe I should articulate it a bit more.

  • If we are only filtering out based on statistics, min/max might make sense to quickly filter out large chunks of rows.
  • If we are filtering on values (e.g. filter pushdown), I think it makes sense to filter only on the shared hashmap and not bother with the min/max values: creating hashes and doing a single table lookup is quite fast, so I think we want to avoid also evaluating the min/max expression (at least for all rows).

I think it also makes sense to think about a heuristic so that we use this pushdown only when we think it might be useful, e.g. when the left side is much smaller than the right side, or when we know (based on column statistics) that it will filter out rows.
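The two row-level filters being compared can be sketched as follows. This is a simplified illustration over plain `u64` keys, not the PR's code: `Bounds`, `filter_by_bounds` and `filter_by_lookup` are hypothetical names, and a `HashSet` stands in for the shared join hash table.

```rust
use std::collections::HashSet;

/// Inclusive min/max of the build-side join keys (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Bounds {
    pub min: u64,
    pub max: u64,
}

/// Keep probe keys that fall inside the build-side bounds.
/// Cheap, but may keep keys that have no match on the build side.
pub fn filter_by_bounds(probe: &[u64], b: Bounds) -> Vec<u64> {
    probe
        .iter()
        .copied()
        .filter(|k| *k >= b.min && *k <= b.max)
        .collect()
}

/// Keep probe keys that are actually present on the build side.
/// Exact, at the cost of one hash and one lookup per row.
pub fn filter_by_lookup(probe: &[u64], build_keys: &HashSet<u64>) -> Vec<u64> {
    probe
        .iter()
        .copied()
        .filter(|k| build_keys.contains(k))
        .collect()
}
```

For build keys {10, 20, 30} and probe keys [5, 10, 15, 30, 40], the bounds filter keeps [10, 15, 30] while the lookup keeps only [10, 30]. The lookup result is always a subset of the bounds result, which is why evaluating both per row adds work without removing more rows.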

Dandandan avatar Jun 18 '25 22:06 Dandandan

I think it makes sense to filter only on the shared hashmap and not bother with the min/max values: creating hashes and doing a single table lookup is quite fast, so I think we want to avoid also evaluating the min/max expression (at least for all rows)

I'm surprised that the hash table lookup, even if O(1), has such a small constant factor that it costs about as much as a couple of binary comparisons. That said, a reason to still do both is stats pruning and filter caching: simple filters like col >= 123 and col <= 456 can be used for stats pruning and can easily be cached (for example for filter-caching-based indexing). So even if performance is not strictly better, there is still something to be said for including a simple filter in addition to the hash table lookup.
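As a rough illustration of why a simple range filter is useful for stats pruning, here is a sketch that decides from min/max statistics alone whether a whole file or row group can be skipped. `ColumnStats`, `can_skip` and `surviving_containers` are hypothetical names, not DataFusion or Parquet APIs.

```rust
/// Min/max statistics for one column in one file or row group (hypothetical type).
pub struct ColumnStats {
    pub min: i64,
    pub max: i64,
}

/// True when no row in the container can satisfy `col >= lo AND col <= hi`,
/// i.e. the container's value range and the filter range do not overlap.
pub fn can_skip(stats: &ColumnStats, lo: i64, hi: i64) -> bool {
    stats.max < lo || stats.min > hi
}

/// Indices of the containers that still have to be read.
pub fn surviving_containers(all: &[ColumnStats], lo: i64, hi: i64) -> Vec<usize> {
    all.iter()
        .enumerate()
        .filter(|(_, s)| !can_skip(s, lo, hi))
        .map(|(i, _)| i)
        .collect()
}
```

A hash table lookup cannot be evaluated against min/max statistics this way, which is the argument for keeping the scalar bounds even when the lookup is also pushed down.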

adriangb avatar Jun 18 '25 23:06 adriangb

I think it also makes sense to think about a heuristic so that we use this pushdown only when we think it might be useful, e.g. when the left side is much smaller than the right side, or when we know (based on column statistics) that it will filter out rows

DataFusion is generally not great at these things: we often don't have enough stats / info to make decisions like this.

adriangb avatar Jun 19 '25 20:06 adriangb

I think it makes sense to filter only on the shared hashmap and not bother with the min/max values: creating hashes and doing a single table lookup is quite fast, so I think we want to avoid also evaluating the min/max expression (at least for all rows)

I'm surprised that the hash table lookup, even if O(1), has such a small constant factor that it costs about as much as a couple of binary comparisons. That said, a reason to still do both is stats pruning and filter caching: simple filters like col >= 123 and col <= 456 can be used for stats pruning and can easily be cached (for example for filter-caching-based indexing). So even if performance is not strictly better, there is still something to be said for including a simple filter in addition to the hash table lookup.

It's hard to say generally, but a hashtable lookup which fits into cache on a u64 key can be really fast.

Dandandan avatar Jun 20 '25 13:06 Dandandan

It's hard to say generally, but a hashtable lookup which fits into cache on a u64 key can be really fast.

I guess only benchmarks can tell. But I still think the scalar bounds are worth keeping for stats pruning reasons.

adriangb avatar Jun 20 '25 13:06 adriangb

To share some experience: we recently added a similar pushdown for HashJoinExec (at Coralogix) by sharing the Arc<JoinLeftData> and comparing column hashes, and so far it seems very effective with predicate pushdown enabled.

Dandandan avatar Jun 24 '25 20:06 Dandandan

I was originally planning on keeping this PR smaller but it's been growing so I might as well add the Arc<LeftData> :)

adriangb avatar Jun 24 '25 20:06 adriangb

I was originally planning on keeping this PR smaller but it's been growing so I might as well add the Arc<LeftData> :)

Feel free to PR it however you like ;)

Dandandan avatar Jun 24 '25 21:06 Dandandan

@Dandandan any chance you'd be willing to contribute your implementation of sharing Arc<LeftData> so we use something we know is working / I don't have to re-invent the wheel? I think you can just push it to this branch.

adriangb avatar Jun 25 '25 12:06 adriangb

@alamb I'd be interested to see what benchmarks say if you don't mind kicking them off?

adriangb avatar Jun 26 '25 00:06 adriangb

@alamb I'd be interested to see what benchmarks say if you don't mind kicking them off?

IIRC, this optimization should speed up the TPC-H benchmark, so we could run that directly. Or we could construct a small table and probe it with a big table to see the effect.

xudong963 avatar Jun 26 '25 07:06 xudong963

@alamb ping for benchmarks run 🙏🏻

adriangb avatar Jun 27 '25 17:06 adriangb

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1015-gcp #15~24.04.1-Ubuntu SMP Thu Apr 24 20:41:05 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing hash-join-pushdown (ab8b2e6315a3352b31d3b90da4c7bb37dbce5b70) to 59143c144d0f11265b7e33421fd6c3733b6ae26e diff
Benchmarks: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

alamb avatar Jun 27 '25 18:06 alamb

@Dandandan any chance you'd be willing to contribute your implementation of sharing Arc<LeftData> so we use something we know is working / I don't have to re-invent the wheel? I think you can just push it to this branch.

I'll try to have a look soon but don't know when I'll have some space for it yet!

Dandandan avatar Jun 27 '25 19:06 Dandandan

🤖: Benchmark completed

Details

Comparing HEAD and hash-join-pushdown
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ hash-join-pushdown ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0     │  1962.99 ms │         1960.97 ms │    no change │
│ QQuery 1     │   688.10 ms │          735.65 ms │ 1.07x slower │
│ QQuery 2     │  1332.88 ms │         1416.49 ms │ 1.06x slower │
│ QQuery 3     │   657.85 ms │          653.72 ms │    no change │
│ QQuery 4     │  1371.75 ms │         1356.13 ms │    no change │
│ QQuery 5     │ 15278.08 ms │        15524.22 ms │    no change │
│ QQuery 6     │  2017.37 ms │         2082.79 ms │    no change │
│ QQuery 7     │  2063.13 ms │         2021.54 ms │    no change │
└──────────────┴─────────────┴────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                 ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                 │ 25372.15ms │
│ Total Time (hash-join-pushdown)   │ 25751.52ms │
│ Average Time (HEAD)               │  3171.52ms │
│ Average Time (hash-join-pushdown) │  3218.94ms │
│ Queries Faster                    │          0 │
│ Queries Slower                    │          2 │
│ Queries with No Change            │          6 │
│ Queries with Failure              │          0 │
└───────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ hash-join-pushdown ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.18 ms │            2.26 ms │     no change │
│ QQuery 1     │    33.15 ms │           34.35 ms │     no change │
│ QQuery 2     │    83.08 ms │           80.97 ms │     no change │
│ QQuery 3     │    94.65 ms │           98.73 ms │     no change │
│ QQuery 4     │   640.05 ms │          599.48 ms │ +1.07x faster │
│ QQuery 5     │   889.72 ms │          864.15 ms │     no change │
│ QQuery 6     │     2.25 ms │            2.33 ms │     no change │
│ QQuery 7     │    37.91 ms │           37.43 ms │     no change │
│ QQuery 8     │   857.21 ms │          866.70 ms │     no change │
│ QQuery 9     │  1183.15 ms │         1190.70 ms │     no change │
│ QQuery 10    │   253.21 ms │          268.22 ms │  1.06x slower │
│ QQuery 11    │   289.40 ms │          289.43 ms │     no change │
│ QQuery 12    │   875.27 ms │          902.01 ms │     no change │
│ QQuery 13    │  1244.98 ms │         1293.02 ms │     no change │
│ QQuery 14    │   804.00 ms │          838.79 ms │     no change │
│ QQuery 15    │   789.50 ms │          796.05 ms │     no change │
│ QQuery 16    │  1606.04 ms │         1650.66 ms │     no change │
│ QQuery 17    │  1586.37 ms │         1639.90 ms │     no change │
│ QQuery 18    │  2880.02 ms │         2949.30 ms │     no change │
│ QQuery 19    │    87.23 ms │           87.08 ms │     no change │
│ QQuery 20    │  1134.16 ms │         1184.12 ms │     no change │
│ QQuery 21    │  1280.50 ms │         1328.58 ms │     no change │
│ QQuery 22    │  2126.21 ms │         2200.67 ms │     no change │
│ QQuery 23    │  7372.53 ms │         7690.34 ms │     no change │
│ QQuery 24    │   459.72 ms │          482.54 ms │     no change │
│ QQuery 25    │   393.66 ms │          405.90 ms │     no change │
│ QQuery 26    │   523.87 ms │          534.28 ms │     no change │
│ QQuery 27    │  1527.62 ms │         1565.71 ms │     no change │
│ QQuery 28    │ 11958.57 ms │        11946.80 ms │     no change │
│ QQuery 29    │   526.30 ms │          525.65 ms │     no change │
│ QQuery 30    │   779.74 ms │          810.76 ms │     no change │
│ QQuery 31    │   809.91 ms │          846.36 ms │     no change │
│ QQuery 32    │  2568.92 ms │         2529.35 ms │     no change │
│ QQuery 33    │  3216.34 ms │         3247.30 ms │     no change │
│ QQuery 34    │  3217.53 ms │         3308.66 ms │     no change │
│ QQuery 35    │  1255.14 ms │         1244.00 ms │     no change │
│ QQuery 36    │   124.75 ms │          123.15 ms │     no change │
│ QQuery 37    │    51.39 ms │           54.53 ms │  1.06x slower │
│ QQuery 38    │   121.36 ms │          121.71 ms │     no change │
│ QQuery 39    │   192.92 ms │          194.83 ms │     no change │
│ QQuery 40    │    42.47 ms │           42.25 ms │     no change │
│ QQuery 41    │    38.02 ms │           38.90 ms │     no change │
│ QQuery 42    │    33.82 ms │           32.66 ms │     no change │
└──────────────┴─────────────┴────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                 ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                 │ 53994.84ms │
│ Total Time (hash-join-pushdown)   │ 54950.62ms │
│ Average Time (HEAD)               │  1255.69ms │
│ Average Time (hash-join-pushdown) │  1277.92ms │
│ Queries Faster                    │          1 │
│ Queries Slower                    │          2 │
│ Queries with No Change            │         40 │
│ Queries with Failure              │          0 │
└───────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ hash-join-pushdown ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 1     │ 102.11 ms │          100.28 ms │    no change │
│ QQuery 2     │  20.16 ms │           21.52 ms │ 1.07x slower │
│ QQuery 3     │  33.16 ms │           33.76 ms │    no change │
│ QQuery 4     │  18.97 ms │           19.02 ms │    no change │
│ QQuery 5     │  50.37 ms │           51.37 ms │    no change │
│ QQuery 6     │  11.99 ms │           11.87 ms │    no change │
│ QQuery 7     │  88.26 ms │          100.63 ms │ 1.14x slower │
│ QQuery 8     │  24.72 ms │           26.26 ms │ 1.06x slower │
│ QQuery 9     │  54.05 ms │           57.56 ms │ 1.07x slower │
│ QQuery 10    │  43.38 ms │           43.87 ms │    no change │
│ QQuery 11    │  11.29 ms │           11.81 ms │    no change │
│ QQuery 12    │  34.83 ms │           40.14 ms │ 1.15x slower │
│ QQuery 13    │  26.11 ms │           26.58 ms │    no change │
│ QQuery 14    │   9.77 ms │           10.23 ms │    no change │
│ QQuery 15    │  19.70 ms │           19.28 ms │    no change │
│ QQuery 16    │  19.18 ms │           18.99 ms │    no change │
│ QQuery 17    │  96.32 ms │           95.92 ms │    no change │
│ QQuery 18    │ 198.10 ms │          221.52 ms │ 1.12x slower │
│ QQuery 19    │  25.57 ms │           26.41 ms │    no change │
│ QQuery 20    │  31.84 ms │           31.60 ms │    no change │
│ QQuery 21    │ 148.56 ms │          161.40 ms │ 1.09x slower │
│ QQuery 22    │  15.50 ms │           14.99 ms │    no change │
└──────────────┴───────────┴────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                 ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                 │ 1083.94ms │
│ Total Time (hash-join-pushdown)   │ 1145.03ms │
│ Average Time (HEAD)               │   49.27ms │
│ Average Time (hash-join-pushdown) │   52.05ms │
│ Queries Faster                    │         0 │
│ Queries Slower                    │         7 │
│ Queries with No Change            │        15 │
│ Queries with Failure              │         0 │
└───────────────────────────────────┴───────────┘

alamb avatar Jun 27 '25 19:06 alamb

Looks like not much change, maybe even some slowdowns. I'll try to address https://github.com/apache/datafusion/pull/16445#discussion_r2154953337. It's also probably worth breaking up this PR into one that enables filters to pass through HashJoinExec (and makes that implementation shared across join execs) and a separate PR that enables HashJoinExec to produce dynamic filters.

adriangb avatar Jun 27 '25 19:06 adriangb

🤖: Benchmark completed

Details

I think this is about what to expect from these benchmarks and min/max pruning. Most (all?) of the data in TPC-H is randomly generated with a good distribution, which makes min/max-based pruning ineffective as it cannot prune any files. Clickbench doesn't have joins.

We probably have to minimize the overhead of computing min/max and see if we're not getting regressions on other benchmarks. Perhaps we should also skip if the build side is too large? The larger the build side, the higher the overhead of computing min/max bounds, and the created filters generally become less selective.

Perhaps a benchmark could be found/created that shows some more benefit with min/max pruning.
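One way the "skip when it is unlikely to help" idea could look in code is sketched below. The thresholds and the names `PushdownHeuristic` and `should_push` are assumptions made up for illustration, not existing DataFusion configuration. The probe-side row count is optional because, as noted earlier in the thread, such statistics are often unavailable.

```rust
/// Hypothetical knobs for deciding whether to build and push a bounds filter.
#[derive(Debug, Clone, Copy)]
pub struct PushdownHeuristic {
    /// Skip bounds collection when the build side has more rows than this.
    pub max_build_rows: usize,
    /// When a probe-side estimate exists, require probe/build to be at least this.
    pub min_probe_to_build_ratio: f64,
}

impl PushdownHeuristic {
    pub fn should_push(&self, build_rows: usize, probe_rows_estimate: Option<usize>) -> bool {
        // An empty build side has no bounds to derive; a huge one makes the
        // bounds expensive to compute and usually less selective.
        if build_rows == 0 || build_rows > self.max_build_rows {
            return false;
        }
        match probe_rows_estimate {
            // Only worth it when the probe side is much larger than the build side.
            Some(probe_rows) => {
                probe_rows as f64 / build_rows as f64 >= self.min_probe_to_build_ratio
            }
            // No statistics: fall back to the build-size check alone.
            None => true,
        }
    }
}
```

Suitable threshold values would have to come from benchmarks; nothing in this thread establishes them.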

Dandandan avatar Jun 27 '25 20:06 Dandandan

We probably have to minimize the overhead of computing min/max and see if we're not getting regressions on other benchmarks. Perhaps we should also skip if the build side is too large? The larger the build side, the higher the overhead of computing min/max bounds, and the created filters generally become less selective.

Makes a lot of sense. I suppose if we build the min/max while we build the hash table it should be even lower cost.
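Folding the bounds into the build loop could look roughly like this. It is a simplified sketch over `u64` keys and a std `HashMap`, not the actual `HashJoinExec` build code; `BuildSide` and `build` are hypothetical names.

```rust
use std::collections::HashMap;

/// Result of consuming the build side (hypothetical type).
pub struct BuildSide {
    /// Join key -> row indices on the build side.
    pub table: HashMap<u64, Vec<usize>>,
    /// Inclusive (min, max) of all inserted keys; `None` if the build side is empty.
    pub bounds: Option<(u64, u64)>,
}

/// Build the hash table and track min/max in the same pass over the keys,
/// so the bounds cost is two comparisons per row and no second scan.
pub fn build(keys: &[u64]) -> BuildSide {
    let mut table: HashMap<u64, Vec<usize>> = HashMap::with_capacity(keys.len());
    let mut bounds: Option<(u64, u64)> = None;
    for (row, &key) in keys.iter().enumerate() {
        table.entry(key).or_default().push(row);
        bounds = Some(match bounds {
            None => (key, key),
            Some((lo, hi)) => (lo.min(key), hi.max(key)),
        });
    }
    BuildSide { table, bounds }
}
```

Whether a per-row loop like this beats a separate vectorized min/max pass over each batch is something only a benchmark can answer.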

adriangb avatar Jun 27 '25 21:06 adriangb

I think this is about what to expect from these benchmarks and min/max pruning. Most (all?) of the data in TPC-H is randomly generated with a good distribution, which makes min/max-based pruning ineffective as it cannot prune any files. Clickbench doesn't have joins.

Ah, makes sense. The use case that I really care about is joins on ULIDs, where the cardinality is going to be very high and the build side may have 1M rows but the min/max bounds restrict it to a small fraction of the ULID space. I imagine it would be the same with SERIAL PKs and other cases with temporal correlations.
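A back-of-the-envelope way to see the difference between the two situations is to estimate what fraction of the probe-side key range the build-side bounds cover. The helper below is a hypothetical sketch (`covered_fraction` is not an existing API) and assumes probe keys are spread uniformly over their range, which real data may not satisfy.

```rust
/// Fraction of the inclusive probe key range `probe` that overlaps the inclusive
/// build-side bounds `build`. Under a uniform-distribution assumption this
/// approximates the share of probe rows that survive the bounds filter.
pub fn covered_fraction(build: (u64, u64), probe: (u64, u64)) -> f64 {
    if probe.0 > probe.1 || build.0 > build.1 {
        return 0.0; // malformed or empty range
    }
    let lo = build.0.max(probe.0);
    let hi = build.1.min(probe.1);
    if lo > hi {
        return 0.0; // ranges do not overlap
    }
    // Convert to f64 before adding 1 to avoid u64 overflow on full-range inputs.
    ((hi - lo) as f64 + 1.0) / ((probe.1 - probe.0) as f64 + 1.0)
}
```

With randomly distributed keys, as in TPC-H, the build bounds tend to span nearly the whole probe range and the fraction is close to 1, so nothing is pruned. With time-ordered keys such as ULIDs or serial IDs, a recent slice of build rows can cover a small fraction of the range.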

adriangb avatar Jun 27 '25 21:06 adriangb

I've pulled out part of this PR, the part about pushing filters down through HashJoinExec plus some new changes to the filter pushdown APIs into https://github.com/apache/datafusion/pull/16642

adriangb avatar Jul 02 '25 00:07 adriangb

Btw here's an article that explains how DuckDB does join filter pushdown. It sounds like they only push down min/max filters: https://duckdb.org/2024/09/09/announcing-duckdb-110.html#dynamic-filter-pushdown-from-joins

adriangb avatar Jul 02 '25 02:07 adriangb

@alamb could I ask you to kick off some benchmarks?

adriangb avatar Jul 31 '25 18:07 adriangb

I think I've addressed all of the feedback and rebased on main / changes broken out into other PRs. @Dandandan @xudong963 I've tagged you both for review.

@alamb would you mind kicking off benchmarks?

adriangb avatar Jul 31 '25 18:07 adriangb

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubuntu SMP Wed May 28 02:40:52 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing hash-join-pushdown (b135cd8614af0131d5e1484216b84499f37fe465) to d376a32f1f78d26a760c261878054bb1328800cc diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

alamb avatar Jul 31 '25 19:07 alamb

🤖: Benchmark completed

Details

Comparing HEAD and hash-join-pushdown
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ hash-join-pushdown ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │  1936.73 ms │         1977.69 ms │     no change │
│ QQuery 1     │   762.69 ms │          763.75 ms │     no change │
│ QQuery 2     │  1481.87 ms │         1466.68 ms │     no change │
│ QQuery 3     │   687.14 ms │          648.45 ms │ +1.06x faster │
│ QQuery 4     │  1362.81 ms │         1405.27 ms │     no change │
│ QQuery 5     │ 14887.33 ms │        14948.71 ms │     no change │
│ QQuery 6     │  2040.21 ms │         2100.60 ms │     no change │
│ QQuery 7     │  1859.25 ms │         1911.89 ms │     no change │
└──────────────┴─────────────┴────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                 ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                 │ 25018.03ms │
│ Total Time (hash-join-pushdown)   │ 25223.04ms │
│ Average Time (HEAD)               │  3127.25ms │
│ Average Time (hash-join-pushdown) │  3152.88ms │
│ Queries Faster                    │          1 │
│ Queries Slower                    │          0 │
│ Queries with No Change            │          7 │
│ Queries with Failure              │          0 │
└───────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ hash-join-pushdown ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.56 ms │            2.22 ms │ +1.15x faster │
│ QQuery 1     │    35.52 ms │           33.65 ms │ +1.06x faster │
│ QQuery 2     │    83.47 ms │           82.33 ms │     no change │
│ QQuery 3     │    98.75 ms │           99.65 ms │     no change │
│ QQuery 4     │   589.45 ms │          609.21 ms │     no change │
│ QQuery 5     │   853.26 ms │          893.98 ms │     no change │
│ QQuery 6     │     2.30 ms │            2.23 ms │     no change │
│ QQuery 7     │    39.08 ms │           39.01 ms │     no change │
│ QQuery 8     │   846.57 ms │          866.28 ms │     no change │
│ QQuery 9     │  1192.91 ms │         1175.10 ms │     no change │
│ QQuery 10    │   257.51 ms │          260.38 ms │     no change │
│ QQuery 11    │   293.94 ms │          292.24 ms │     no change │
│ QQuery 12    │   877.37 ms │          896.65 ms │     no change │
│ QQuery 13    │  1238.00 ms │         1291.84 ms │     no change │
│ QQuery 14    │   826.31 ms │          833.51 ms │     no change │
│ QQuery 15    │   792.69 ms │          800.79 ms │     no change │
│ QQuery 16    │  1597.34 ms │         1619.89 ms │     no change │
│ QQuery 17    │  1573.03 ms │         1627.82 ms │     no change │
│ QQuery 18    │  2835.98 ms │         2880.59 ms │     no change │
│ QQuery 19    │    86.15 ms │           88.40 ms │     no change │
│ QQuery 20    │  1118.68 ms │         1177.86 ms │  1.05x slower │
│ QQuery 21    │  1262.10 ms │         1328.57 ms │  1.05x slower │
│ QQuery 22    │  2066.92 ms │         2236.89 ms │  1.08x slower │
│ QQuery 23    │  7344.24 ms │         7580.33 ms │     no change │
│ QQuery 24    │   444.04 ms │          457.65 ms │     no change │
│ QQuery 25    │   301.35 ms │          313.55 ms │     no change │
│ QQuery 26    │   437.08 ms │          446.77 ms │     no change │
│ QQuery 27    │  1521.59 ms │         1569.90 ms │     no change │
│ QQuery 28    │ 11853.42 ms │        12776.43 ms │  1.08x slower │
│ QQuery 29    │   520.67 ms │          522.14 ms │     no change │
│ QQuery 30    │   778.74 ms │          798.51 ms │     no change │
│ QQuery 31    │   795.04 ms │          802.20 ms │     no change │
│ QQuery 32    │  2425.30 ms │         2434.52 ms │     no change │
│ QQuery 33    │  3152.55 ms │         3222.42 ms │     no change │
│ QQuery 34    │  3241.80 ms │         3218.07 ms │     no change │
│ QQuery 35    │  1285.41 ms │         1276.66 ms │     no change │
│ QQuery 36    │   120.40 ms │          125.16 ms │     no change │
│ QQuery 37    │    50.62 ms │           51.33 ms │     no change │
│ QQuery 38    │   121.37 ms │          120.85 ms │     no change │
│ QQuery 39    │   188.72 ms │          200.24 ms │  1.06x slower │
│ QQuery 40    │    41.17 ms │           42.82 ms │     no change │
│ QQuery 41    │    40.66 ms │           37.95 ms │ +1.07x faster │
│ QQuery 42    │    32.46 ms │           32.62 ms │     no change │
└──────────────┴─────────────┴────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                 ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                 │ 53266.50ms │
│ Total Time (hash-join-pushdown)   │ 55169.23ms │
│ Average Time (HEAD)               │  1238.76ms │
│ Average Time (hash-join-pushdown) │  1283.01ms │
│ Queries Faster                    │          3 │
│ Queries Slower                    │          5 │
│ Queries with No Change            │         35 │
│ Queries with Failure              │          0 │
└───────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ hash-join-pushdown ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 1     │  99.04 ms │          100.58 ms │    no change │
│ QQuery 2     │  20.94 ms │           20.34 ms │    no change │
│ QQuery 3     │  32.40 ms │           33.10 ms │    no change │
│ QQuery 4     │  17.88 ms │           18.93 ms │ 1.06x slower │
│ QQuery 5     │  49.33 ms │           50.27 ms │    no change │
│ QQuery 6     │  11.85 ms │           11.99 ms │    no change │
│ QQuery 7     │  85.41 ms │           89.71 ms │ 1.05x slower │
│ QQuery 8     │  24.66 ms │           23.71 ms │    no change │
│ QQuery 9     │  54.04 ms │           52.89 ms │    no change │
│ QQuery 10    │  43.01 ms │           42.22 ms │    no change │
│ QQuery 11    │  11.40 ms │           11.27 ms │    no change │
│ QQuery 12    │  34.76 ms │           35.10 ms │    no change │
│ QQuery 13    │  26.29 ms │           26.60 ms │    no change │
│ QQuery 14    │   9.72 ms │            9.95 ms │    no change │
│ QQuery 15    │  18.90 ms │           19.07 ms │    no change │
│ QQuery 16    │  18.18 ms │           18.11 ms │    no change │
│ QQuery 17    │  97.90 ms │           97.01 ms │    no change │
│ QQuery 18    │ 199.45 ms │          194.61 ms │    no change │
│ QQuery 19    │  25.11 ms │           24.59 ms │    no change │
│ QQuery 20    │  32.00 ms │           31.28 ms │    no change │
│ QQuery 21    │ 143.48 ms │          145.18 ms │    no change │
│ QQuery 22    │  14.39 ms │           14.39 ms │    no change │
└──────────────┴───────────┴────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                 ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                 │ 1070.16ms │
│ Total Time (hash-join-pushdown)   │ 1070.91ms │
│ Average Time (HEAD)               │   48.64ms │
│ Average Time (hash-join-pushdown) │   48.68ms │
│ Queries Faster                    │         0 │
│ Queries Slower                    │         2 │
│ Queries with No Change            │        20 │
│ Queries with Failure              │         0 │
└───────────────────────────────────┴───────────┘

alamb avatar Jul 31 '25 19:07 alamb

Might be worth running TPCH or another benchmark with joins?

adriangb avatar Jul 31 '25 20:07 adriangb

It did run

Benchmark tpch_mem_sf1.json

I can run whatever benchmark you want - just let me know

alamb avatar Jul 31 '25 20:07 alamb