trino
Use SortedPositionLink for BETWEEN joins
Currently, we use inequality joins only for expressions like probe_symbol < build_symbol AND probe_symbol + 1 > build_symbol, but we don't use an inequality join for probe_symbol <> build_symbol.
Affected queries:
tpch/q21
tpcds/q16
tpcds/q19
tpcds/q46
tpcds/q64
tpcds/q68
tpcds/q94
tpcds/q95
It would be great to know how big the build side is per hash entry in these joins, in order to determine whether such an optimization makes sense. cc @skrzypo987 @lukasz-stec
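To make the question about build-side size per hash entry concrete, here is a rough sketch of the general idea behind keeping the build rows of one hash entry sorted. It is not Trino's SortedPositionLinks code; the class SortedBucketSketch and all of its methods are hypothetical names made up for illustration. It assumes a single hash entry whose build-side sort values fit in a long[], and compares a binary-search range lookup against a plain scan.

```java
import java.util.Arrays;

// Hypothetical illustration only; not a Trino class.
final class SortedBucketSketch {
    // Build-side sort values for ONE hash entry (same equi-join key), ascending.
    private final long[] sortedBuildValues;

    SortedBucketSketch(long[] buildValues) {
        this.sortedBuildValues = buildValues.clone();
        Arrays.sort(this.sortedBuildValues);
    }

    // Index of the first value >= target.
    private int lowerBound(long target) {
        int lo = 0;
        int hi = sortedBuildValues.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedBuildValues[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Index of the first value > target.
    private int upperBound(long target) {
        int lo = 0;
        int hi = sortedBuildValues.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedBuildValues[mid] <= target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Number of build rows with low <= value <= high, found with two binary searches.
    int countBetween(long low, long high) {
        if (low > high) {
            return 0;
        }
        return upperBound(high) - lowerBound(low);
    }

    // Baseline: check the filter against every build row in the hash entry.
    int countBetweenByScan(long low, long high) {
        int count = 0;
        for (long value : sortedBuildValues) {
            if (value >= low && value <= high) {
                count++;
            }
        }
        return count;
    }
}
```

With only a handful of build rows per hash entry the scan is likely just as good, which is why the size per entry matters for deciding whether the optimization pays off.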
@sopel39 Is this issue still available to pick up?
@WinkerDu sure
@sopel39
I've checked the code in SortExpressionVisitor (https://github.com/trinodb/trino/blob/master/core/trino-main/src/main/java/io/trino/sql/planner/SortExpressionExtractor.java#L95), and I'd like to confirm what this issue suggests doing.
visitComparisonExpression doesn't handle the NOT_EQUAL (<>) operator; we could add it to the case match conditions.
visitBetweenPredicate can only handle one side of BETWEEN (GREATER_THAN_OR_EQUAL or LESS_THAN_OR_EQUAL); we could optimize it and let it handle both sides of BETWEEN as conjunct expressions.
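The second point, treating both sides of BETWEEN as conjuncts, could look roughly like the sketch below. This is only an illustration under simplifying assumptions: Operator, Comparison, Between and toConjuncts are hypothetical stand-ins with string operands, not Trino's planner classes or the actual SortExpressionVisitor logic.

```java
import java.util.List;

// Hypothetical illustration only; these are not Trino planner classes.
final class BetweenBoundsSketch {
    enum Operator { LESS_THAN, LESS_THAN_OR_EQUAL, GREATER_THAN, GREATER_THAN_OR_EQUAL }

    // Simplified comparison node: operands are kept as plain strings.
    record Comparison(Operator operator, String left, String right) {}

    // Simplified node for: value BETWEEN min AND max.
    record Between(String value, String min, String max) {}

    // Rewrites value BETWEEN min AND max into the two equivalent conjuncts
    // value >= min AND value <= max, so each bound can be inspected separately.
    static List<Comparison> toConjuncts(Between between) {
        return List.of(
                new Comparison(Operator.GREATER_THAN_OR_EQUAL, between.value(), between.min()),
                new Comparison(Operator.LESS_THAN_OR_EQUAL, between.value(), between.max()));
    }

    private BetweenBoundsSketch() {}
}
```

For example, a filter such as build_symbol BETWEEN probe_symbol AND probe_symbol + 1 would yield a lower-bound conjunct and an upper-bound conjunct, rather than only one of the two.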
Correct me if I am wrong, and can you assign this issue to me if appropriate, thank you.
@WinkerDu That looks correct
This looks interesting and should benefit the join queries. As the old PR is not active, I can raise another PR to keep it cleaner with the latest code. I will refer to the old PR (https://github.com/trinodb/trino/pull/14598).
Does that sound good, @sopel39 @mosabua?
Fixed via https://github.com/trinodb/trino/pull/22521