Math-Prime-Util icon indicating copy to clipboard operation
Math-Prime-Util copied to clipboard

Generation of k-almost primes in a given range is slow

Open trizen opened this issue 2 years ago • 4 comments

Iterating over k-almost primes in a given range [A,B], when A and B are quite large, seems to be quite slow. But it can be made much faster! :)

Example:

use 5.020;
use strict;
use warnings;

use ntheory qw(:all);
use Time::HiRes qw(gettimeofday tv_interval);

my $count = 0;
my $t0 = [gettimeofday];

foralmostprimes {
    ++$count;
} 20, 2147483649, 4294967296;

#~ my $arr = almost_primes(20, 2147483649, 4294967296);   # this takes twice as long
#~ $count = @$arr;

say "Count = $count";
say "Took: ", tv_interval($t0, [gettimeofday]);

Output:

Count = 10427
Took: 11.249787

Here's a much faster algorithm:

use 5.020;
use strict;
use warnings;

use ntheory qw(:all);
use experimental qw(signatures);
use Time::HiRes qw(gettimeofday tv_interval);

sub divceil ($x,$y) {   # ceil(x/y)
    my $q = divint($x, $y);
    (mulint($q, $y) == $x) ? $q : ($q+1);
}

sub almost_prime_numbers ($A, $B, $k, $callback) {

    $A = vecmax($A, powint(2, $k));

    sub ($m, $p, $k) {

        if ($k == 1) {

            forprimes {
                $callback->(mulint($m, $_));
            } vecmax($p, divceil($A, $m)), divint($B, $m);

            return;
        }

        my $s = rootint(divint($B, $m), $k);

        while ($p <= $s) {

            my $t = mulint($m, $p);

            # Optional optimization for tight ranges
            if (divceil($A, $t) > divint($B, $t)) {
                $p = next_prime($p);
                next;
            }

            __SUB__->($t, $p, $k - 1);
            $p = next_prime($p);
        }
    }->(1, 2, $k);
}

my $count = 0;
my $t0 = [gettimeofday];

almost_prime_numbers(2147483649, 4294967296, 20, sub { ++$count });

say "Count = $count";
say "Took: ", tv_interval($t0, [gettimeofday]);

Output:

Count = 10427
Took: 0.006179

trizen avatar May 12 '22 08:05 trizen

This looks very similar to the range_construct_almost_prime function in almost_primes.c. It's been languishing there 3 years now. Which version (sieve or construct) is better depends on a lot of things (e.g. k, n range, n start), which is why I never got to it. With some arguments the range sieve is better, other arguments favor construction. I'll also want to compare your recursive version with mine.

I was hoping I could come up with something even better, but nothing has hit me.

Your method with the callback is certainly preferable for something like foralmostprimes { }.

danaj avatar May 03 '23 14:05 danaj

So this does turn out to be significantly faster for many examples. Sadly they are not in order so if we're using the values we still need a sort. The sort often takes the majority of the time.

danaj avatar May 04 '23 10:05 danaj

In theory this is resolved in ad82672b471441051a3c09caea34c506cff77193 today.

foralmostprimes:

time perl -Iblib/lib -Iblib/arch /tmp/f68-2.pl 
Count = 10427
Took: 0.001043

almost primes then count the array:

time perl -Iblib/lib -Iblib/arch /tmp/f68-2.pl 
Count = 10427
Took: 0.000925

almost primes then count all in PP (this is generating the array, not doing the callback with the count, which would be faster)

MPU_NO_XS=1 MPU_NO_GMP=1 time perl -Iblib/lib -Iblib/arch /tmp/f68-2.pl 
Count = 10427
Took: 0.041728

However, foralmostprimes {} in PP is still abysmally slow as it's walking every integer and calling is_almost_prime(). Ugh. It'll get a rewrite.

danaj avatar May 04 '23 13:05 danaj

Should be fixed in PP foralmostprimes {} now as well.

danaj avatar May 06 '23 14:05 danaj

I confirm the fix. Thanks! :)

trizen avatar Apr 28 '24 11:04 trizen