frozen icon indicating copy to clipboard operation
frozen copied to clipboard

Binary search on small maps

Open degski opened this issue 5 years ago • 7 comments

In the case of smallish maps, linear search (below some threshold) will outperform binary search, f.e. converting a hex char to its decimal counterpart:

frozen::map<char, char, 22> map_lookup {
    { '0',  0 }, { '1',  1 }, { '2',  2 }, { '3',  3 }, { '4',  4 }, { '5',  5 }, { '6',  6 }, { '7',  7 }, { '8',  8 },
    { '9',  9 }, { 'a', 10 }, { 'b', 11 }, { 'c', 12 }, { 'd', 13 }, { 'e', 14 }, { 'f', 15 }, { 'A', 10 }, { 'B', 11 },
    { 'C', 12 }, { 'D', 13 }, { 'E', 14 }, { 'F', 15 }
};

int table_lookup ( char s_ ) noexcept {
    struct KeyValue { const char key, value; };
    static const std::array<KeyValue, 22> table { {
        { '0',  0 }, { '1',  1 }, { '2',  2 }, { '3',  3 }, { '4',  4 }, { '5',  5 }, { '6',  6 }, { '7',  7 }, { '8',  8 },
        { '9',  9 }, { 'a', 10 }, { 'b', 11 }, { 'c', 12 }, { 'd', 13 }, { 'e', 14 }, { 'f', 15 }, { 'A', 10 }, { 'B', 11 },
        { 'C', 12 }, { 'D', 13 }, { 'E', 14 }, { 'F', 15 }
    } };
    for ( auto [ k, v ] : table )
        if ( k == s_ )
            return v;
    return -1;
}

The second approach is about 40% faster than the first one.

degski avatar Apr 21 '19 05:04 degski

@degski Can you give a full example program that shows that 40%? I tried but did not find such big differences. :-(

RokerHRO avatar Apr 21 '19 15:04 RokerHRO

@RokerHRO https://gist.github.com/degski/a10a66d12971fa5709494af1ec131a32 (it's actually 50%, table_lookup00() is faster (a lot), but is no longer comparable to the frozen_map lookup).

degski avatar Apr 22 '19 01:04 degski

@degski : OMG, that is far longer than a "minimal testcase", I didn't even understand that source completely. I'll write my own test case / benchmark program… :-(

RokerHRO avatar Apr 23 '19 07:04 RokerHRO

@RokerHRO LOL, You didn't ask for a "minimal testcase", you asked for a "full example program". It is, it tests the various implementations of the conversion of a hex character string to the numeric representation. It's described in the referenced blog-post. It should compile without issue on nix with gcc or clang, on win you'll probably need clang.

Now you've asked for a minimal test case, I'll put one together and alert you to it.

degski avatar Apr 23 '19 12:04 degski

@RokerHRO I've stripped all the garbage out, here it is: https://gist.github.com/degski/accd357a6f5c93c70d4e623a40b160c4 . What these functions do is described here https://lemire.me/blog/2019/04/17/parsing-short-hexadecimal-strings-efficiently/

I realized as well that my expression of how much faster [it works in my mind, but probably not for others] is weird, when I say 40% faster, I meant to say the the said program runs in 60% of the time of the reference program. i.e. 50% means twice as fast.

degski avatar Apr 24 '19 06:04 degski

Running your example under Windows using MSVC v14.27 I get the following results:

table_lookup:
sum  = 1638054990250
time = 1894
frozen_map:
sum  = 1638054990250
time = 664

For reference I compiled with FROZEN_NO_EXCEPTIONS and changed plf::nanotimer to std::chrono::high_resolution_clock-machinery, as well as __attribute__ ((noinline)) to __declspec(noinline)

Tradias avatar Sep 27 '20 11:09 Tradias

What are the statistics for the values you quote? Is the standard deviation or measure of error small? Are we sure that the compiler is not optimizing out the test (which can be both good & bad)?

fndry-jmg avatar Jun 09 '21 09:06 fndry-jmg