math icon indicating copy to clipboard operation
math copied to clipboard

Prime Number Functions

Open mborland opened this issue 4 years ago • 119 comments

An initial addition of two prime number functions:

prime_sieve is a linear prime sieve algorithm. Currently benchmarks O(n).

prime_range returns all the prime numbers in the range [lower_bound, upper_bound]. For the first 1000 primes it can use the already built in lookup tables; outside of that it will call prime_sieve. This is the function intended for end users to call.

Future additions would include a Miller-Rabin primality test.

mborland avatar Jul 10 '20 04:07 mborland

FYI there's a Miller Rabin test in Multiprecision already.

jzmaddock avatar Jul 10 '20 08:07 jzmaddock

@mborland : OMG thank you; we've needed a prime sieve for so long.

NAThompson avatar Jul 10 '20 13:07 NAThompson

@mborland : I would definitely change the name from prime_functions.hpp to prime_sieve.hpp; the first is not very memorable.

NAThompson avatar Jul 10 '20 13:07 NAThompson

@mborland : IIRC, the Joy of Factoring also creates bitsets which set prime at bit i and has it zeroed at composites. Is this a useful API?

NAThompson avatar Jul 11 '20 02:07 NAThompson

@NAThompson That sounds like wheel factorization. That method is added to the sieve of Eratosthenes, but would not be applicable here.

mborland avatar Jul 11 '20 02:07 mborland

Would it be more ergonomic to have prime_range use a back_inserter, but prime_sieve behave more like std::fill?

std::vector<int64_t> primes(1000); // gimme 1000 primes
prime_sieve(primes.begin(), primes.end());

(I need to stay in my lane here; @jeremy-murphy is much better at ergonomics.)

NAThompson avatar Jul 11 '20 13:07 NAThompson

@mborland : I have begun a performance comparison with Kim Walish's prime sieve. Here is the code:

#include <vector>
#include <boost/math/special_functions/prime_sieve.hpp>
#include <benchmark/benchmark.h>
#include <primesieve.hpp>

template <class Z>
void prime_sieve(benchmark::State& state)
{
    Z upper = static_cast<Z>(state.range(0));
    for(auto _ : state)
    {
        std::vector<Z> primes;
        benchmark::DoNotOptimize(boost::math::prime_sieve(static_cast<Z>(2), upper, std::back_inserter(primes)));
    }
    state.SetComplexityN(state.range(0));
}
BENCHMARK_TEMPLATE(prime_sieve, int64_t)->RangeMultiplier(2)->Range(1 << 1, 1 << 22)->Complexity();

template <class Z>
void kimwalish_primes(benchmark::State& state)
{

    Z upper = static_cast<Z>(state.range(0));
    for (auto _ : state)
    {
        std::vector<Z> primes;
        primesieve::generate_primes(upper, &primes);
        benchmark::DoNotOptimize(primes.back());
    }
    state.SetComplexityN(state.range(0));
}
BENCHMARK_TEMPLATE(kimwalish_primes, int64_t)->RangeMultiplier(2)->Range(1 << 1, 1 << 22)->Complexity();

BENCHMARK_MAIN();

and the results:

Run on (16 X 2300 MHz CPU s)
CPU Caches:
  L1 Data 32K (x8)
  L1 Instruction 32K (x8)
  L2 Unified 262K (x8)
  L3 Unified 16777K (x1)
Load Average: 2.49, 1.81, 1.78
----------------------------------------------------------------------------
Benchmark                                  Time             CPU   Iterations
----------------------------------------------------------------------------
prime_sieve<int64_t>/2                   348 ns          348 ns      1939182
prime_sieve<int64_t>/4                   463 ns          462 ns      1467524
prime_sieve<int64_t>/8                   603 ns          602 ns      1174654
prime_sieve<int64_t>/16                  717 ns          717 ns       956781
prime_sieve<int64_t>/32                  907 ns          907 ns       732693
prime_sieve<int64_t>/64                 1146 ns         1146 ns       604673
prime_sieve<int64_t>/128                1410 ns         1410 ns       494574
prime_sieve<int64_t>/256                2190 ns         2190 ns       319042
prime_sieve<int64_t>/512                3838 ns         3838 ns       186498
prime_sieve<int64_t>/1024               6769 ns         6769 ns       103041
prime_sieve<int64_t>/2048              12757 ns        12756 ns        54359
prime_sieve<int64_t>/4096              25295 ns        25291 ns        26912
prime_sieve<int64_t>/8192              50898 ns        50880 ns        13661
prime_sieve<int64_t>/16384            101938 ns       101936 ns         6707
prime_sieve<int64_t>/32768            201957 ns       201936 ns         3455
prime_sieve<int64_t>/65536            396735 ns       396709 ns         1754
prime_sieve<int64_t>/131072           779986 ns       779933 ns          864
prime_sieve<int64_t>/262144          1534550 ns      1534233 ns          446
prime_sieve<int64_t>/524288          3020453 ns      3019390 ns          228
prime_sieve<int64_t>/1048576         6089458 ns      6089187 ns          107
prime_sieve<int64_t>/2097152        14131817 ns     14130295 ns           44
prime_sieve<int64_t>/4194304        30953282 ns     30952476 ns           21
prime_sieve<int64_t>_BigO               0.33 NlgN       0.33 NlgN
prime_sieve<int64_t>_RMS                   9 %             9 %
kimwalish_primes<int64_t>/2              276 ns          276 ns      2263102
kimwalish_primes<int64_t>/4              280 ns          280 ns      2617321
kimwalish_primes<int64_t>/8              279 ns          279 ns      2461625
kimwalish_primes<int64_t>/16             289 ns          289 ns      2526273
kimwalish_primes<int64_t>/32             296 ns          296 ns      2348788
kimwalish_primes<int64_t>/64             287 ns          287 ns      2397589
kimwalish_primes<int64_t>/128            357 ns          357 ns      2088954
kimwalish_primes<int64_t>/256            368 ns          368 ns      1894662
kimwalish_primes<int64_t>/512            459 ns          459 ns      1601636
kimwalish_primes<int64_t>/1024          2177 ns         2176 ns       319871
kimwalish_primes<int64_t>/2048          2535 ns         2534 ns       274488
kimwalish_primes<int64_t>/4096          3377 ns         3377 ns       206241
kimwalish_primes<int64_t>/8192          4527 ns         4525 ns       155072
kimwalish_primes<int64_t>/16384         7015 ns         7014 ns        98151
kimwalish_primes<int64_t>/32768        10982 ns        10981 ns        63122
kimwalish_primes<int64_t>/65536        19631 ns        19627 ns        35826
kimwalish_primes<int64_t>/131072       35359 ns        35356 ns        19769
kimwalish_primes<int64_t>/262144       66530 ns        66523 ns        10449
kimwalish_primes<int64_t>/524288      126772 ns       126769 ns         4532
kimwalish_primes<int64_t>/1048576     246035 ns       245968 ns         2837
kimwalish_primes<int64_t>/2097152     477212 ns       477177 ns         1450
kimwalish_primes<int64_t>/4194304    1109480 ns      1109200 ns          621
kimwalish_primes<int64_t>_BigO          0.01 NlgN       0.01 NlgN
kimwalish_primes<int64_t>_RMS             11 %            11 %

So it looks like there is 1-2 orders of magnitude of performance improvement left in the boost implementation; presumably we need to find it. . .

NAThompson avatar Jul 11 '20 14:07 NAThompson

@NAThompson It would be fairly easy to create light wrappers to the current prime_sieve implementation like in the uni-variate statistics library.

I will have to do some digging to see where more performance can be squeezed out.

mborland avatar Jul 11 '20 14:07 mborland

It would be fairly easy to create light wrappers to the current prime_sieve implementation like in the uni-variate statistics library.

Nah, if it's not obviously a better way to do it, I'm not interested in it.

NAThompson avatar Jul 11 '20 14:07 NAThompson

@mborland : Also, make sure to tag your commit messages with [CI SKIP] until we are about to merge.

NAThompson avatar Jul 11 '20 15:07 NAThompson

I just went through The Joy of Factoring to refresh my memory on how these sieves work. I implemented algorithm 8.2 of that book, which the author claims runs in O(J*log(log(J)) time. He also references P.A. Pritchard who has given an algorithm which runs in O(J/log(log(J)) time; see A sublinear additive sieve for finding prime numbers, Comm. ACM 24, 1981.

In any case, this is how my naive implementation of algorithm 8.2 looks:

// Copyright 2020 Matt Borland
//
// Use, modification and distribution are subject to the
// Boost Software License, Version 1.0.
// (See accompanying file LICENSE_1_0.txt
// or copy at http://www.boost.org/LICENSE_1_0.txt)

#include <vector>
#include <boost/math/special_functions/prime_sieve.hpp>
#include <benchmark/benchmark.h>
#include <primesieve.hpp>

template <class Z>
void prime_sieve(benchmark::State& state)
{
    Z upper = static_cast<Z>(state.range(0));
    for(auto _ : state)
    {
        std::vector<Z> primes;
        benchmark::DoNotOptimize(boost::math::prime_sieve(static_cast<Z>(2), upper, std::back_inserter(primes)));
    }
    state.SetComplexityN(state.range(0));
}
BENCHMARK_TEMPLATE(prime_sieve, int64_t)->RangeMultiplier(2)->Range(1 << 1, 1 << 22)->Complexity();

template <class Z>
void kimwalish_primes(benchmark::State& state)
{

    Z upper = static_cast<Z>(state.range(0));
    for (auto _ : state)
    {
        std::vector<Z> primes;
        primesieve::generate_primes(upper, &primes);
        benchmark::DoNotOptimize(primes.back());
    }
    state.SetComplexityN(state.range(0));
}
BENCHMARK_TEMPLATE(kimwalish_primes, int64_t)->RangeMultiplier(2)->Range(1 << 1, 1 << 22)->Complexity();


template<typename Z>
std::vector<bool> joy_of_factoring_bitset(const Z J)
{
    using std::sqrt;
    using std::log;
    // A vector of bools behaves like a bitset, but with runtime size:
    std::vector<bool> P(J, true);
    P[0] = false;
    P[1] = false;
    Z p = 2;
    Z root_j = sqrt(J);
    while (p <= root_j) {
        Z i = p + p;
        while (i <= J) {
            P[i] = false;
            i = i + p;
        }
        i = p + 1;
        while (i <= root_j && P[i] == false) {
            i += 1;   
        }
        p = i;
    }
    return P;   
}

template<typename Z>
std::vector<Z> joy_of_factoring_primes(const Z J)
{
    auto P = joy_of_factoring_bitset(J);
    std::vector<Z> primes;
    if (J >= 355991) {
        double inv_log = 1/log(J);
        size_t N = J*inv_log*(1+inv_log + 2.51*inv_log*inv_log);
        primes.reserve(N);
    }
    else if (J>=55) {
        double n = static_cast<double>(J)/(log(J) - 4.0);
        primes.reserve(static_cast<size_t>(n));
    }
    primes.emplace_back(2);
    for (size_t i = 3; i < J; i += 2) {
        if (P[i]) {
            primes.emplace_back(i);   
        }
    }
    return primes;
}

template <class Z>
void joy_of_factoring_bm(benchmark::State& state)
{
    
    Z upper = static_cast<Z>(state.range(0));
    for (auto _ : state)
    {
        benchmark::DoNotOptimize(joy_of_factoring_primes(upper));
    }
    state.SetComplexityN(state.range(0));
}
BENCHMARK_TEMPLATE(joy_of_factoring_bm, size_t)->RangeMultiplier(2)->Range(1 << 1, 1 << 22)->Complexity();

BENCHMARK_MAIN();

And the results:

prime_sieve<int64_t>/2                     352 ns          351 ns      1951056
prime_sieve<int64_t>/4                     470 ns          470 ns      1481710
prime_sieve<int64_t>/8                     597 ns          597 ns      1129615
prime_sieve<int64_t>/16                    738 ns          738 ns       953198
prime_sieve<int64_t>/32                    936 ns          936 ns       762660
prime_sieve<int64_t>/64                   1145 ns         1145 ns       616909
prime_sieve<int64_t>/128                  1407 ns         1407 ns       495898
prime_sieve<int64_t>/256                  2247 ns         2246 ns       314915
prime_sieve<int64_t>/512                  3715 ns         3713 ns       189649
prime_sieve<int64_t>/1024                 6664 ns         6663 ns       103458
prime_sieve<int64_t>/2048                12745 ns        12741 ns        54189
prime_sieve<int64_t>/4096                25364 ns        25360 ns        27329
prime_sieve<int64_t>/8192                51157 ns        51151 ns        13472
prime_sieve<int64_t>/16384              102583 ns       102567 ns         6739
prime_sieve<int64_t>/32768              203596 ns       203550 ns         3428
prime_sieve<int64_t>/65536              399037 ns       399001 ns         1753
prime_sieve<int64_t>/131072             781683 ns       781580 ns          876
prime_sieve<int64_t>/262144            1538995 ns      1538545 ns          448
prime_sieve<int64_t>/524288            3095005 ns      3094665 ns          230
prime_sieve<int64_t>/1048576           6326961 ns      6326208 ns          101
prime_sieve<int64_t>/2097152          14024588 ns     14024205 ns           44
prime_sieve<int64_t>/4194304          31189328 ns     31180333 ns           21
prime_sieve<int64_t>_BigO                 0.33 NlgN       0.33 NlgN
prime_sieve<int64_t>_RMS                     9 %             9 %
kimwalish_primes<int64_t>/2                279 ns          279 ns      2509536
kimwalish_primes<int64_t>/4                289 ns          289 ns      2494788
kimwalish_primes<int64_t>/8                282 ns          282 ns      2466786
kimwalish_primes<int64_t>/16               313 ns          313 ns      2333162
kimwalish_primes<int64_t>/32               293 ns          293 ns      2271474
kimwalish_primes<int64_t>/64               304 ns          304 ns      2286663
kimwalish_primes<int64_t>/128              372 ns          372 ns      2002437
kimwalish_primes<int64_t>/256              383 ns          383 ns      1830046
kimwalish_primes<int64_t>/512              444 ns          443 ns      1595158
kimwalish_primes<int64_t>/1024            2178 ns         2177 ns       318499
kimwalish_primes<int64_t>/2048            2557 ns         2556 ns       271694
kimwalish_primes<int64_t>/4096            3363 ns         3363 ns       206716
kimwalish_primes<int64_t>/8192            4591 ns         4591 ns       152819
kimwalish_primes<int64_t>/16384           6976 ns         6974 ns        97478
kimwalish_primes<int64_t>/32768          11104 ns        11103 ns        62514
kimwalish_primes<int64_t>/65536          19537 ns        19534 ns        35619
kimwalish_primes<int64_t>/131072         35760 ns        35756 ns        19317
kimwalish_primes<int64_t>/262144         66588 ns        66586 ns        10404
kimwalish_primes<int64_t>/524288        128041 ns       128034 ns         5385
kimwalish_primes<int64_t>/1048576       245994 ns       245973 ns         2815
kimwalish_primes<int64_t>/2097152       481273 ns       481164 ns         1445
kimwalish_primes<int64_t>/4194304      1110850 ns      1110702 ns          618
kimwalish_primes<int64_t>_BigO            0.01 NlgN       0.01 NlgN
kimwalish_primes<int64_t>_RMS               11 %            11 %
joy_of_factoring_bm<size_t>/2             66.1 ns         66.1 ns     10413103
joy_of_factoring_bm<size_t>/4              252 ns          252 ns      2807457
joy_of_factoring_bm<size_t>/8              385 ns          385 ns      1901306
joy_of_factoring_bm<size_t>/16             508 ns          507 ns      1370265
joy_of_factoring_bm<size_t>/32             618 ns          618 ns      1111023
joy_of_factoring_bm<size_t>/64             238 ns          238 ns      2947071
joy_of_factoring_bm<size_t>/128            334 ns          334 ns      2148722
joy_of_factoring_bm<size_t>/256            532 ns          532 ns      1259151
joy_of_factoring_bm<size_t>/512           1027 ns         1027 ns       662553
joy_of_factoring_bm<size_t>/1024          2490 ns         2490 ns       281698
joy_of_factoring_bm<size_t>/2048          5211 ns         5211 ns       130022
joy_of_factoring_bm<size_t>/4096         10454 ns        10454 ns        65775
joy_of_factoring_bm<size_t>/8192         20807 ns        20805 ns        33289
joy_of_factoring_bm<size_t>/16384        41352 ns        41351 ns        16687
joy_of_factoring_bm<size_t>/32768        82184 ns        82180 ns         8354
joy_of_factoring_bm<size_t>/65536       161913 ns       161879 ns         4242
joy_of_factoring_bm<size_t>/131072      320842 ns       320755 ns         2170
joy_of_factoring_bm<size_t>/262144      640485 ns       640433 ns         1073
joy_of_factoring_bm<size_t>/524288     1289366 ns      1289235 ns          536
joy_of_factoring_bm<size_t>/1048576    2589004 ns      2588625 ns          269
joy_of_factoring_bm<size_t>/2097152    5304918 ns      5302623 ns          130
joy_of_factoring_bm<size_t>/4194304   11621459 ns     11620593 ns           59
joy_of_factoring_bm<size_t>_BigO          0.12 NlgN       0.12 NlgN
joy_of_factoring_bm<size_t>_RMS              5 %             5 %

Algorithm 8.2 is actually competitive with Kim Walish for smaller N but starts to lag for larger N.

NAThompson avatar Jul 11 '20 19:07 NAThompson

@NAThompson I am seeing a similar result using a segmented sieve. Currently faster than kimwalish, and joy_of_factoring_bm from your post until 524288. After that kimwalish pulls ahead.

mborland avatar Jul 11 '20 19:07 mborland

Just ran this benchmark under perf:

kimwalish

So Kim Walish's prime generator does indeed compute logarithms and fill up a table of primes; quite a bit of time in malloc; quite a bit of time in fill.

However if you look at the Joy of Factoring algorithm, it spends way more time in the fill:

joy_of_factoring

So there are some opportunities there; you'll see that I'm only checking odd numbers after 2, there's probably some way to extend that to making the fill faster.

NAThompson avatar Jul 11 '20 20:07 NAThompson

@NAThompson I have found some performance improvements using C-style arrays instead of std::vector::reserve() which makes sense given that data.

mborland avatar Jul 11 '20 20:07 mborland

@NAThompson This latest iteration's benchmark delivered much better results:

prime_sieve<int64_t>/real_time_BigO 2.87 N 0.22 N
prime_sieve<int64_t>/real_time_RMS 11 % 27 %

mborland avatar Jul 13 '20 17:07 mborland

@mborland : Really cool. BTW, if you're going to do threading, write a parallel execution policy; we need the ability to not use threads. Here's an example I started (but never finished since it was just not enough compute/memory reference).

NAThompson avatar Jul 13 '20 18:07 NAThompson

@NAThompson This should fit the bill. Performance ticked down a bit to 3.07N/0.23N because I realized I was relying on the prime table to be statically linked. If you dynamically linked the previous implementation the performance was ~0.2NlgN.

mborland avatar Jul 14 '20 02:07 mborland

Performance ticked down a bit to 3.07N/0.23N because I realized I was relying on the prime table to be statically linked.

I do recommend that the prime table not be used at all; my own preference would be to deprecate this header once your sieve is in.

NAThompson avatar Jul 14 '20 13:07 NAThompson

@NAThompson That makes sense. I can add the notice.

mborland avatar Jul 14 '20 16:07 mborland

@NAThompson What are your thoughts on defining a seq_prime_sieve function instead of wrapping the parallel prime_sieve in a block that requires an execution policy. My concern is that execution policies are a C++17 facility, and everything is currently compliant with C++11 except that.

mborland avatar Jul 14 '20 16:07 mborland

@mborland : All the new features I'm adding require C++17; also the univariate_statistics.hpp file requires C++-17 and uses if constexpr to great effect, IMO.

My view is that gcc, clang and MSVC have been C++-17 compliant for some time now, so if C++-17 provides idioms that makes a function more ergonomic then we should use it. In addition, I thought that your sieve with parallel execution policies was shaping up to be one of the coolest features in all of the library. So I'm biased on that point. . .

NAThompson avatar Jul 14 '20 16:07 NAThompson

@NAThompson Works for me. I saw the notice that C++03 is just now being deprecated so I figured I would ask. I believe that GCC 11 is going to have C++17 set as the new default.

mborland avatar Jul 14 '20 16:07 mborland

I do recommend that the prime table not be used at all; my own preference would be to deprecate this header once your sieve is in.

Note that the internal use case for that table really does require fast table lookup of primes.

jzmaddock avatar Jul 14 '20 17:07 jzmaddock

Note that the internal use case for that table really does require fast table lookup of primes.

Ok, no deprecation message, but there is hope that this will be faster than reading out of program memory.

NAThompson avatar Jul 14 '20 17:07 NAThompson

@NAThompson A section to use as many threads as processors available has been added. With the range of benchmark maxed out at 2^30 it bench marks at 6.86 N/0.69N

mborland avatar Jul 14 '20 18:07 mborland

I added an additional unit test for the multi-threading section and added this set of tests to the Jamfile. Are the 2 thread tests ok to send to CI or should I comment those out as well?

mborland avatar Jul 15 '20 17:07 mborland

@mborland : What is the use case for boost::math::prime_range(100, 1000, std::back_inserter(primes));? I understand wanting all the primes < x, but I've never seen anyone need primes in a range.

Are the 2 thread tests ok to send to CI or should I comment those out as well?

Yeah send them in. Also, could you run it under -fsanitize=thread and make sure it's clean?

NAThompson avatar Jul 15 '20 18:07 NAThompson

@NAThompson I can't personally tell you a use case for prime_range, but the functionality is available in other libraries like sympy.

Locally -fsanitize=thread and fsanitize=address are both clean.

mborland avatar Jul 15 '20 19:07 mborland

@mborland : Just tried to build the docs; got the following error:

sf/number_series.qbk:325: error: Mismatched [endsect] near column 9.

NAThompson avatar Jul 15 '20 20:07 NAThompson

@mborland : Very cool; here's the comparison to Kim Walish's sieve:

#include <vector>
#include <boost/math/special_functions/prime_sieve.hpp>
#include <benchmark/benchmark.h>
#include <primesieve.hpp>

template <class Z>
void prime_sieve(benchmark::State& state)
{
    Z upper = static_cast<Z>(state.range(0));
    for(auto _ : state)
    {
        std::vector<Z> primes;
        benchmark::DoNotOptimize(boost::math::prime_sieve(std::execution::par, upper, std::back_inserter(primes)));
    }
    state.SetComplexityN(state.range(0));
}
BENCHMARK_TEMPLATE(prime_sieve, int64_t)->RangeMultiplier(2)->Range(1 << 1, 1 << 22)->Complexity(benchmark::oN)->UseRealTime();

template <class Z>
void kimwalish_primes(benchmark::State& state)
{

    Z upper = static_cast<Z>(state.range(0));
    for (auto _ : state)
    {
        std::vector<Z> primes;
        primesieve::generate_primes(upper, &primes);
        benchmark::DoNotOptimize(primes.back());
    }
    state.SetComplexityN(state.range(0));
}
BENCHMARK_TEMPLATE(kimwalish_primes, int64_t)->RangeMultiplier(2)->Range(1 << 1, 1 << 22)->Complexity(benchmark::oN)->UseRealTime();

BENCHMARK_MAIN();

Here I force it to regress against O(N) scaling since that makes comparison easier:

prime_sieve<int64_t>/2/real_time                   417 ns          416 ns      1688428
prime_sieve<int64_t>/4/real_time                   590 ns          589 ns      1132537
prime_sieve<int64_t>/8/real_time                   932 ns          930 ns       730881
prime_sieve<int64_t>/16/real_time                 1235 ns         1229 ns       573691
prime_sieve<int64_t>/32/real_time                 1477 ns         1473 ns       465590
prime_sieve<int64_t>/64/real_time                 1672 ns         1667 ns       412482
prime_sieve<int64_t>/128/real_time                2017 ns         2009 ns       356051
prime_sieve<int64_t>/256/real_time                2795 ns         2786 ns       249987
prime_sieve<int64_t>/512/real_time                4226 ns         4209 ns       166142
prime_sieve<int64_t>/1024/real_time               6851 ns         6835 ns        99070
prime_sieve<int64_t>/2048/real_time              13647 ns        13622 ns        49657
prime_sieve<int64_t>/4096/real_time              24109 ns        24061 ns        28328
prime_sieve<int64_t>/8192/real_time              88496 ns        38358 ns         7544
prime_sieve<int64_t>/16384/real_time            104841 ns        40361 ns         5822
prime_sieve<int64_t>/32768/real_time            181437 ns        65559 ns         4329
prime_sieve<int64_t>/65536/real_time            281237 ns        68548 ns         2480
prime_sieve<int64_t>/131072/real_time           572515 ns       101001 ns         1238
prime_sieve<int64_t>/262144/real_time          1155111 ns       154189 ns          614
prime_sieve<int64_t>/524288/real_time          2800881 ns       294266 ns          278
prime_sieve<int64_t>/1048576/real_time         5264969 ns       445906 ns          127
prime_sieve<int64_t>/2097152/real_time        14900995 ns      1180893 ns           56
prime_sieve<int64_t>/4194304/real_time        45743068 ns      2255611 ns           18
prime_sieve<int64_t>/real_time_BigO               9.83 N          0.54 N
prime_sieve<int64_t>/real_time_RMS                  61 %            16 %
kimwalish_primes<int64_t>/2/real_time              580 ns          465 ns      1461148
kimwalish_primes<int64_t>/4/real_time              507 ns          473 ns      1430308
kimwalish_primes<int64_t>/8/real_time              691 ns          550 ns      1165038
kimwalish_primes<int64_t>/16/real_time             681 ns          523 ns       935754
kimwalish_primes<int64_t>/32/real_time             481 ns          454 ns      1098076
kimwalish_primes<int64_t>/64/real_time             506 ns          490 ns      1000000
kimwalish_primes<int64_t>/128/real_time            585 ns          571 ns      1169549
kimwalish_primes<int64_t>/256/real_time            625 ns          613 ns      1066912
kimwalish_primes<int64_t>/512/real_time            753 ns          730 ns       895759
kimwalish_primes<int64_t>/1024/real_time          3281 ns         3228 ns       169253
kimwalish_primes<int64_t>/2048/real_time          3736 ns         3728 ns       184918
kimwalish_primes<int64_t>/4096/real_time          5086 ns         5074 ns       130160
kimwalish_primes<int64_t>/8192/real_time          7088 ns         7050 ns        97634
kimwalish_primes<int64_t>/16384/real_time        10945 ns        10921 ns        62913
kimwalish_primes<int64_t>/32768/real_time        17168 ns        17057 ns        40847
kimwalish_primes<int64_t>/65536/real_time        30259 ns        30192 ns        22623
kimwalish_primes<int64_t>/131072/real_time       55462 ns        55328 ns        12102
kimwalish_primes<int64_t>/262144/real_time      103074 ns       102894 ns         6650
kimwalish_primes<int64_t>/524288/real_time      195996 ns       195575 ns         3428
kimwalish_primes<int64_t>/1048576/real_time     380300 ns       379350 ns         1849
kimwalish_primes<int64_t>/2097152/real_time     863277 ns       861173 ns          776
kimwalish_primes<int64_t>/4194304/real_time    1719699 ns      1714099 ns          394
kimwalish_primes<int64_t>/real_time_BigO          0.41 N          0.41 N
kimwalish_primes<int64_t>/real_time_RMS              7 %             7 %

NAThompson avatar Jul 15 '20 22:07 NAThompson