
executor: introduce tag pointer in hash join v2

Open windtalker opened this issue 6 months ago • 4 comments

What problem does this PR solve?

Issue Number: ref #53127

Problem Summary:

What changed and how does it work?

This PR adds tagged pointers to hash join v2. The basic idea of a tagged pointer is that, for every unsafe.Pointer value, the N most significant bits (N is usually at least 16) are all zero, so those N bits can be used to store extra information. In hash join v2, we use them to store part of the hash value.

During the hash join v2 build stage, we store each row address in its slot in the hash table. With this PR, instead of storing the raw row address, we store a tagged pointer whose tag is extracted from the hash value. For example, suppose:

  • the raw address is 0x0000xxxxxxxxxxxx
  • the hash value is 0xabcdxxxxxxxxxxxx

Before this PR, we save 0x0000xxxxxxxxxxxx into the hash table; after this PR, we save 0xabcdxxxxxxxxxxxx.

During the probe stage, we can first test the tag 0xabcd000000000000 against the hash value:

  • if 0xabcd000000000000 & hashvalue == 0xabcd000000000000, the hash value might match, so we still need to compare the join keys
  • if 0xabcd000000000000 & hashvalue != 0xabcd000000000000, the hash value is different, so the probe result is definitely false and there is no need to compare the join keys
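
The build and probe steps described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the names tagOf, tagAddr, untagAddr and mayMatch are hypothetical, N is assumed to be 16, and row addresses are modeled as plain uint64 values rather than real unsafe.Pointer values so the sketch stays safe to run.

```go
package main

import "fmt"

// Assumption for this sketch: the top 16 bits of every row address are zero,
// so they are free to hold a tag. All names below are hypothetical.
const (
	tagBits = 16
	tagMask = uint64(0xffff) << (64 - tagBits) // 0xffff000000000000
)

// tagOf keeps only the top tagBits of the hash value.
func tagOf(hash uint64) uint64 { return hash & tagMask }

// tagAddr stores the tag in the unused high bits of addr (build stage).
func tagAddr(addr, hash uint64) uint64 { return addr | tagOf(hash) }

// untagAddr clears the tag bits and recovers the original row address.
func untagAddr(tagged uint64) uint64 { return tagged &^ tagMask }

// mayMatch applies the test from the description (probe stage): the row can
// only match when every tag bit stored in the slot is also set in probeHash.
func mayMatch(tagged, probeHash uint64) bool {
	tag := tagged & tagMask
	return tag&probeHash == tag
}

func main() {
	addr := uint64(0x00007f3a12345678)      // made-up row address
	buildHash := uint64(0xabcd1122334455ff) // made-up hash of the build row
	slot := tagAddr(addr, buildHash)

	fmt.Printf("slot value:   0x%016x\n", slot)
	fmt.Printf("row address:  0x%016x\n", untagAddr(slot))
	fmt.Println("same hash may match: ", mayMatch(slot, buildHash))
	fmt.Println("other hash may match:", mayMatch(slot, 0x1234000000000000))
}
```

A passing test only means the hash value might match, so the join keys still have to be compared; a failing test lets the probe skip that comparison entirely.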

Testing tagged pointers on the TPCH-50 dataset:

  • Build time: almost the same with and without tagged pointers
  • Probe time: reduced by 15% with tagged pointers
  • Probe collisions: greatly reduced with tagged pointers, from 1594141458 to 232887049 (about 6.8x fewer)

Check List

Tests

  • [x] Unit test
  • [ ] Integration test
  • [x] Manual test (add detailed scripts or steps below)
  • [ ] No need to test
    • [ ] I checked and no code files have been changed.

Side effects

  • [ ] Performance regression: Consumes more CPU
  • [ ] Performance regression: Consumes more Memory
  • [ ] Breaking backward compatibility

Documentation

  • [ ] Affects user behaviors
  • [ ] Contains syntax changes
  • [ ] Contains variable changes
  • [ ] Contains experimental features
  • [ ] Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

windtalker, Aug 16 '24 08:08