tidb
tidb copied to clipboard
executor: introduce tag pointer in hash join v2
What problem does this PR solve?
Issue Number: ref #53127
Problem Summary:
What changed and how does it work?
This pr add tagged pointer in hash join v2.
The basic idea of tagged pointer is for all the value of unsafe.Pointer, the first N(usually N is at least 16) MSB is all zero. So we can use these N bit to store extra information.
In Hash join v2, we use the N bit to store part of the value of hash value:
During hash join v2 build, we store the row address to the related slot in hash table. In this pr, instead of store the row address in hash table, we store the tagged pointer in hash table, and the tagged value is extracted from hash value.
For example
raw address is: 0x0000xxxxxxxxxxxx
hash value is 0xabcdxxxxxxxxxxxx
then before this pr, we save 0x0000xxxxxxxxxxxx into hash table, after this pr we save 0xabcdxxxxxxxxxxxx into hash table
During the probe stage, we can first use 0xabcd000000000000 to test against the hash value:
- if
0xabcd000000000000 & hashvalue==0xabcd000000000000, it means the hash value might match, we need further compare the join key - if
0xabcd000000000000 & hashvalue!=0xabcd000000000000, it means the hash value is different, then the probe result is definitely false, there is no need to further compare of the join key.
Test tag pointer on TPCH-50 dataset:
Build time: almost the same with/without tag pointer
Probe time: probe time is reduced by 15% with tag pointer
Probe collision: probe collision is reduce greatly with tag pointer(from 1594141458 to 232887049)
Check List
Tests
- [x] Unit test
- [ ] Integration test
- [x] Manual test (add detailed scripts or steps below)
- [ ] No need to test
- [ ] I checked and no code files have been changed.
Side effects
- [ ] Performance regression: Consumes more CPU
- [ ] Performance regression: Consumes more Memory
- [ ] Breaking backward compatibility
Documentation
- [ ] Affects user behaviors
- [ ] Contains syntax changes
- [ ] Contains variable changes
- [ ] Contains experimental features
- [ ] Changes MySQL compatibility
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
None
Hi @windtalker. Thanks for your PR.
PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.
I understand the commands that are listed here.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Codecov Report
Attention: Patch coverage is 96.80851% with 3 lines in your changes missing coverage. Please review.
Project coverage is 58.1764%. Comparing base (
3419bde) to head (21e0671). Report is 25 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #55470 +/- ##
=================================================
- Coverage 73.0085% 58.1764% -14.8321%
=================================================
Files 1584 1749 +165
Lines 443036 641533 +198497
=================================================
+ Hits 323454 373221 +49767
- Misses 99870 243872 +144002
- Partials 19712 24440 +4728
| Flag | Coverage Δ | |
|---|---|---|
| integration | 40.4564% <81.9148%> (?) |
|
| unit | 73.7870% <96.8085%> (+1.6228%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
| Components | Coverage Δ | |
|---|---|---|
| dumpling | 54.5253% <ø> (+1.5686%) |
:arrow_up: |
| parser | ∅ <ø> (∅) |
|
| br | 53.1736% <ø> (+7.7154%) |
:arrow_up: |
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: wshwsh12, XuHuaiyu
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [XuHuaiyu,wshwsh12]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
[LGTM Timeline notifier]
Timeline:
what's the overall performance gain when running some SQL with hash job?
what's the overall performance gain when running some SQL with hash job?
you mean hash join v2 vs hash join v1 or hash join v2 with tagged pointer vs hash join v2 without tagged pointer?
what's the overall performance gain when running some SQL with hash job?
you mean hash join v2 vs hash join v1 or hash join v2 with tagged pointer vs hash join v2 without tagged pointer?
hash job v2 I suppose? before and after this PR
what's the overall performance gain when running some SQL with hash job?
you mean hash join v2 vs hash join v1 or hash join v2 with tagged pointer vs hash join v2 without tagged pointer?
hash job v2 I suppose? before and after this PR
For this pr, test on TPCH-50 shows that there is no significant overall performance gains between hash join v2 with tagged pointer and hash join v2 without tagged pointer. When looking into the hash join itself, the probe time is 15% faster with tagged pointer, and the probe collision is reduced by 10 times, which I think can effectively reduce the CPU time of hash join
which I think can effectively reduce the CPU time of hash join
any compare result about this part? I personally prefer other ways of optimization to bit manipulation, unless we have no choice and it does show significant improve
/hold
hold for a while to get more test result.
which I think can effectively reduce the CPU time of hash join
any compare result about this part? I personally prefer other ways of optimization to bit manipulation, unless we have no choice and it does show significant improve
Probe collision: probe collision is reduce from 1594141458 to 232887049 For each probe collision, probe need to do: https://github.com/pingcap/tidb/blob/eb19c100db5effe6ca86c8cd75ddfac1dde61e38/pkg/executor/join/inner_join_probe.go#L48-L58
- compare the join key(
isKeyMatched) - advance current row to the next row(
getNextRowAddress)
So I think if we can reduce the probe collision, it is for sure that we can reduce the cpu time.
Since hash join v2 use a much simple hash table, it is by design that it may meet many probe colision, especially for the cases that the build side has many duplicate keys. The problem this pr tries to solve is actually
- For extrame cases(build side has many duplicated keys), avoid big performance regression due to probe colission
- For normal cases, the overhead should be as small as possible, so it will not introduce performance regression
In order to avoid probe colision, the basic idea is to use some quick filter. There is at least 3 intuitive approaches
- use tagged pointer
- similar with tagged pointer, but instead of save part of hash value into pointer, each row allocated an extra 8 bytes to save information of the hash value
- for each hash table slot, use a bloom filter
For all the 3 candidates, tagged pointer has the smallest cost, and according to the tests results, it will not cause performance regression. For method 2, it need extra 8 byte for each build row, it cause more memory usage and may cause performance regression for some simple query. For method 3, it need extra memory for each hash table slot, and the cost of construct bloom filter is relatively big and may cause performance regression for simple query.
So I think if we can reduce the probe collision, it is for sure that we can reduce the cpu time.
Thanks for the detailed explanation. it makes sense, but it would be more helpful if we can have a compare result of how much CPU this PR can reduce
/unhold
So I think if we can reduce the probe collision, it is for sure that we can reduce the cpu time.
Thanks for the detailed explanation. it makes sense, but it would be more helpful if we can have a compare result of how much CPU this PR can reduce
For a join, the explain analyze result will be something like this
| └─HashJoin_11 | 12487.50 | 1 | root | | time:533.9µs, loops:2, build_hash_table:{total:484.2µs, fetch:475.4µs, build:500ns}, probe:{concurrency:5, total:2.45ms, max:496.3µs, probe:10.2µs, fetch and wait:2.44ms}
If we simply treated the probe: 10.2us as the probe cpu time, then with this pr, the probe cpu time can be reduced by 15%. I don't have the original data now, I can re-test and record the original data later.
So I think if we can reduce the probe collision, it is for sure that we can reduce the cpu time.
Thanks for the detailed explanation. it makes sense, but it would be more helpful if we can have a compare result of how much CPU this PR can reduce
Test Dataset TPCH-50 Test query: tpch query 1 - tpch query 22
| hash join build time | hash join probe time | |
|---|---|---|
| without tagged pointer | 50341ms | 473064ms |
| with tagged pointer | 51836ms | 408343ms |