tidb executor: introduce tag pointer in hash join v2

What problem does this PR solve?

Issue Number: ref #53127

Problem Summary:

What changed and how does it work?

This pr add tagged pointer in hash join v2. The basic idea of tagged pointer is for all the value of unsafe.Pointer, the first N(usually N is at least 16) MSB is all zero. So we can use these N bit to store extra information. In Hash join v2, we use the N bit to store part of the value of hash value: During hash join v2 build, we store the row address to the related slot in hash table. In this pr, instead of store the row address in hash table, we store the tagged pointer in hash table, and the tagged value is extracted from hash value. For example raw address is: 0x0000xxxxxxxxxxxx hash value is 0xabcdxxxxxxxxxxxx then before this pr, we save 0x0000xxxxxxxxxxxx into hash table, after this pr we save 0xabcdxxxxxxxxxxxx into hash table During the probe stage, we can first use 0xabcd000000000000 to test against the hash value:

if 0xabcd000000000000 & hashvalue == 0xabcd000000000000, it means the hash value might match, we need further compare the join key
if 0xabcd000000000000 & hashvalue != 0xabcd000000000000, it means the hash value is different, then the probe result is definitely false, there is no need to further compare of the join key.

Test tag pointer on TPCH-50 dataset: Build time: almost the same with/without tag pointer Probe time: probe time is reduced by 15% with tag pointer Probe collision: probe collision is reduce greatly with tag pointer(from 1594141458 to 232887049)

Check List

Tests

[x] Unit test
[ ] Integration test
[x] Manual test (add detailed scripts or steps below)
[ ] No need to test
- [ ] I checked and no code files have been changed.

Side effects

[ ] Performance regression: Consumes more CPU
[ ] Performance regression: Consumes more Memory
[ ] Breaking backward compatibility

Documentation

[ ] Affects user behaviors
[ ] Contains syntax changes
[ ] Contains variable changes
[ ] Contains experimental features
[ ] Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Aug 16 '24 08:08 windtalker

Hi @windtalker. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Aug 16 '24 08:08 tiprow[bot]

Codecov Report

Attention: Patch coverage is 96.80851% with 3 lines in your changes missing coverage. Please review.

Project coverage is 58.1764%. Comparing base (3419bde) to head (21e0671). Report is 25 commits behind head on master.

Additional details and impacted files

@@                Coverage Diff                @@
##             master     #55470         +/-   ##
=================================================
- Coverage   73.0085%   58.1764%   -14.8321%     
=================================================
  Files          1584       1749        +165     
  Lines        443036     641533     +198497     
=================================================
+ Hits         323454     373221      +49767     
- Misses        99870     243872     +144002     
- Partials      19712      24440       +4728

Flag	Coverage Δ
integration	`40.4564% <81.9148%> (?)`
unit	`73.7870% <96.8085%> (+1.6228%)`	:arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
dumpling	`54.5253% <ø> (+1.5686%)`	:arrow_up:
parser	`∅ <ø> (∅)`
br	`53.1736% <ø> (+7.7154%)`	:arrow_up:

Aug 16 '24 09:08 codecov[bot]

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wshwsh12, XuHuaiyu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [XuHuaiyu,wshwsh12]

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

Sep 04 '24 02:09 ti-chi-bot[bot]

[LGTM Timeline notifier]

Timeline:

2024-08-28 09:34:43.424295795 +0000 UTC m=+949278.558745918: :ballot_box_with_check: agreed by wshwsh12.
2024-09-04 02:54:44.437439613 +0000 UTC m=+413008.955492539: :ballot_box_with_check: agreed by XuHuaiyu.

Sep 04 '24 02:09 ti-chi-bot[bot]

what's the overall performance gain when running some SQL with hash job?

Sep 04 '24 03:09 D3Hunter

what's the overall performance gain when running some SQL with hash job?

you mean hash join v2 vs hash join v1 or hash join v2 with tagged pointer vs hash join v2 without tagged pointer?

Sep 04 '24 05:09 windtalker

what's the overall performance gain when running some SQL with hash job?

you mean hash join v2 vs hash join v1 or hash join v2 with tagged pointer vs hash join v2 without tagged pointer?

hash job v2 I suppose? before and after this PR

Sep 04 '24 05:09 D3Hunter

what's the overall performance gain when running some SQL with hash job?

you mean hash join v2 vs hash join v1 or hash join v2 with tagged pointer vs hash join v2 without tagged pointer?

hash job v2 I suppose? before and after this PR

For this pr, test on TPCH-50 shows that there is no significant overall performance gains between hash join v2 with tagged pointer and hash join v2 without tagged pointer. When looking into the hash join itself, the probe time is 15% faster with tagged pointer, and the probe collision is reduced by 10 times, which I think can effectively reduce the CPU time of hash join

Sep 04 '24 05:09 windtalker

which I think can effectively reduce the CPU time of hash join

any compare result about this part? I personally prefer other ways of optimization to bit manipulation, unless we have no choice and it does show significant improve

Sep 04 '24 05:09 D3Hunter

/hold

hold for a while to get more test result.

Sep 04 '24 05:09 D3Hunter

which I think can effectively reduce the CPU time of hash join

any compare result about this part? I personally prefer other ways of optimization to bit manipulation, unless we have no choice and it does show significant improve

Probe collision: probe collision is reduce from 1594141458 to 232887049 For each probe collision, probe need to do: https://github.com/pingcap/tidb/blob/eb19c100db5effe6ca86c8cd75ddfac1dde61e38/pkg/executor/join/inner_join_probe.go#L48-L58

compare the join key(isKeyMatched)
advance current row to the next row(getNextRowAddress)

So I think if we can reduce the probe collision, it is for sure that we can reduce the cpu time.

Since hash join v2 use a much simple hash table, it is by design that it may meet many probe colision, especially for the cases that the build side has many duplicate keys. The problem this pr tries to solve is actually

For extrame cases(build side has many duplicated keys), avoid big performance regression due to probe colission
For normal cases, the overhead should be as small as possible, so it will not introduce performance regression

In order to avoid probe colision, the basic idea is to use some quick filter. There is at least 3 intuitive approaches

use tagged pointer
similar with tagged pointer, but instead of save part of hash value into pointer, each row allocated an extra 8 bytes to save information of the hash value
for each hash table slot, use a bloom filter

For all the 3 candidates, tagged pointer has the smallest cost, and according to the tests results, it will not cause performance regression. For method 2, it need extra 8 byte for each build row, it cause more memory usage and may cause performance regression for some simple query. For method 3, it need extra memory for each hash table slot, and the cost of construct bloom filter is relatively big and may cause performance regression for simple query.

Sep 04 '24 06:09 windtalker

So I think if we can reduce the probe collision, it is for sure that we can reduce the cpu time.

Thanks for the detailed explanation. it makes sense, but it would be more helpful if we can have a compare result of how much CPU this PR can reduce

Sep 04 '24 06:09 D3Hunter

/unhold

Sep 04 '24 06:09 D3Hunter

So I think if we can reduce the probe collision, it is for sure that we can reduce the cpu time.

Thanks for the detailed explanation. it makes sense, but it would be more helpful if we can have a compare result of how much CPU this PR can reduce

For a join, the explain analyze result will be something like this

| └─HashJoin_11                  | 12487.50 | 1       | root      |               | time:533.9µs, loops:2, build_hash_table:{total:484.2µs, fetch:475.4µs, build:500ns}, probe:{concurrency:5, total:2.45ms, max:496.3µs, probe:10.2µs, fetch and wait:2.44ms}

If we simply treated the probe: 10.2us as the probe cpu time, then with this pr, the probe cpu time can be reduced by 15%. I don't have the original data now, I can re-test and record the original data later.

Sep 04 '24 06:09 windtalker

So I think if we can reduce the probe collision, it is for sure that we can reduce the cpu time.

Thanks for the detailed explanation. it makes sense, but it would be more helpful if we can have a compare result of how much CPU this PR can reduce

Test Dataset TPCH-50 Test query: tpch query 1 - tpch query 22

	hash join build time	hash join probe time
without tagged pointer	50341ms	473064ms
with tagged pointer	51836ms	408343ms

Sep 05 '24 01:09 windtalker

tidb tidb copied to clipboard

executor: introduce tag pointer in hash join v2

What problem does this PR solve?

What changed and how does it work?

Check List

Release note

Codecov Report

[LGTM Timeline notifier]

tidb
tidb copied to clipboard