Python Performance: 58% faster Project Euler 070

Describe your change:

#8594

Benchmark:
- Old solution: 118.2/10 seconds
- New solution: 49.3/10 seconds

[ ] Add an algorithm?
[x] Fix a bug or typo in an existing algorithm?
[ ] Documentation change?

Checklist:

[x] I have read CONTRIBUTING.md.
[x] This pull request is all my own work -- I have not plagiarized.
[x] I know that pull requests will not be merged if they fail the automated tests.
[x] This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
[ ] All new Python files are placed inside an existing directory.
[ ] All filenames are in all lowercase characters with no spaces or dashes.
[x] All functions and variable names follow Python naming conventions.
[x] All function parameters and return values are annotated with Python type hints.
[x] All functions have doctests that pass the automated testing.
[x] All new algorithms include at least one URL that points to Wikipedia or another similar explanation.
[x] If this pull request resolves one or more open issues then the description above includes the issue number(s) with a closing keyword: "Fixes #ISSUE-NUMBER".

Oct 15 '23 19:10 ManpreetXSingh

@ManpreetSingh2004 a good solution with filtering by prime numbers. Look at 072, I made it, you can probably apply it there too.

Oct 16 '23 20:10 quant12345

Hi @quant12345, Thank you for your suggestion! Your input is much appreciated! I noticed that the prime filtering for 072 is already implemented in sol2.py. Although it is as fast as sol1 with both taking 1 sec. What should be done in this case? Also, I've identified a bug in 'sol1.py' that appears to be related to integer overflow. This issue specifically arises on Windows platforms because, by default, numpy on Windows 64-bit uses 32-bit integers, whereas on Linux, it employs 64-bit integers, so it also went unnoticed in automated checking of this repo. See #10672. I already have 3 PRs open so bot closes new ones.

Oct 18 '23 13:10 ManpreetXSingh

@ManpreetSingh2004 now I checked it in Windows, it’s really an incorrect number if you don’t specify the type. I used your filtering(072) by prime numbers. And with this it’s faster, significantly. Below code:

test


import datetime
import numpy as np
from math import isqrt

def solution_2(limit: int = 1_000_000) -> int:
    """
    Return the number of reduced proper fractions with denominator less than limit.
    >>> solution(8)
    21
    >>> solution(1000)
    304191
    """
    primes = set(range(3, limit, 2))
    primes.add(2)
    for p in range(3, limit, 2):
        if p not in primes:
            continue
        primes.difference_update(set(range(p * p, limit, p)))

    phi = [float(n) for n in range(limit + 1)]

    for p in primes:
        for n in range(p, limit + 1, p):
            phi[n] *= 1 - 1 / p

    return int(sum(phi[2:]))


def np_calculate_prime_numbers(max_number: int) -> list[int]:
    """
    Returns prime numbers below max_number.
    See: https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes

    >>> np_calculate_prime_numbers(10)
    [2, 3, 5, 7]
    >>> np_calculate_prime_numbers(2)
    []
    """
    if max_number <= 2:
        return []

    # List containing a bool value for every odd number below max_number/2
    is_prime = np.ones(max_number // 2, dtype=bool)

    for i in range(3, isqrt(max_number - 1) + 1, 2):
        if is_prime[i // 2]:
            # Mark all multiple of i as not prime using list slicing
            is_prime[i ** 2 // 2:: i] = False

    primes = np.where(is_prime)[0] * 2 + 1
    primes[0] = 2
    return primes


def solution_new(limit: int = 1_000_000) -> int:
    """
    Returns an integer, the solution to the problem
    >>> solution(10)
    31
    >>> solution(100)
    3043
    >>> solution(1_000)
    304191
    """

    # generating an array from -1 to limit
    phi = np.arange(-1, limit)
    primes = np_calculate_prime_numbers(limit)

    for i in primes:
        #ind = np.arange(2 * i, limit + 1, i)  # indexes for selection
        phi[2*i::i] -= phi[2*i::i] // i

    return int(np.sum(phi[2: limit + 1], dtype=np.int64))


def solution(limit: int = 1_000_000) -> int:
    """
    Returns an integer, the solution to the problem
    >>> solution(10)
    31
    >>> solution(100)
    3043
    >>> solution(1_000)
    304191
    """

    # generating an array from -1 to limit
    phi = np.arange(-1, limit)

    for i in range(2, limit + 1):
        if phi[i] == i - 1:
            ind = np.arange(2 * i, limit + 1, i)  # indexes for selection
            phi[ind] -= phi[ind] // i

    return int(np.sum(phi[2: limit + 1], dtype=np.int64))


if __name__ == "__main__":

    n = 1_000_000

    now = datetime.datetime.now()
    old = solution(n)
    print('old_time', datetime.datetime.now() - now)

    now = datetime.datetime.now()
    new = solution_new(n)
    print('new_time', datetime.datetime.now() - now)

    now = datetime.datetime.now()
    solution_2 = solution_2(n)
    print('solution_2', datetime.datetime.now() - now)

    print('comparison for identity', old == new == solution_2)

Output:

old_time 0:00:00.773152
new_time 0:00:00.410564
solution_2 0:00:01.439802
comparison for identity True

Here is a slightly faster implementation primes of primesfrom2to than 'your' primesfrom3to.

Oct 18 '23 16:10 quant12345

@quant12345 Hey, That's really nice. 😊

Oct 18 '23 16:10 ManpreetXSingh

We only want one solution in the code.

@dhruvmanila If I understand correctly, the three slower solutions (slow_solution, slicing_solution, py_solution) should be removed. Am I correct?

Oct 29 '23 15:10 ManpreetXSingh

Hi @dhruvmanila, I've made the requested changes. Please let me know if any additional adjustments are needed. The get_totients function is modified. The remainder of the solution remains unchanged from its original form. Also I've put the benchmark results in the PR description.

Dec 29 '23 12:12 ManpreetXSingh

Python Python copied to clipboard

Performance: 58% faster Project Euler 070

Describe your change:

Checklist:

Python
Python copied to clipboard