Python icon indicating copy to clipboard operation
Python copied to clipboard

Performance: 58% faster Project Euler 070

Open ManpreetXSingh opened this issue 1 year ago • 6 comments

Describe your change:

#8594

  • Benchmark:
    • Old solution: 118.2/10 seconds
    • New solution: 49.3/10 seconds
  • [ ] Add an algorithm?
  • [x] Fix a bug or typo in an existing algorithm?
  • [ ] Documentation change?

Checklist:

  • [x] I have read CONTRIBUTING.md.
  • [x] This pull request is all my own work -- I have not plagiarized.
  • [x] I know that pull requests will not be merged if they fail the automated tests.
  • [x] This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
  • [ ] All new Python files are placed inside an existing directory.
  • [ ] All filenames are in all lowercase characters with no spaces or dashes.
  • [x] All functions and variable names follow Python naming conventions.
  • [x] All function parameters and return values are annotated with Python type hints.
  • [x] All functions have doctests that pass the automated testing.
  • [x] All new algorithms include at least one URL that points to Wikipedia or another similar explanation.
  • [x] If this pull request resolves one or more open issues then the description above includes the issue number(s) with a closing keyword: "Fixes #ISSUE-NUMBER".

ManpreetXSingh avatar Oct 15 '23 19:10 ManpreetXSingh

@ManpreetSingh2004 a good solution with filtering by prime numbers. Look at 072, I made it, you can probably apply it there too.

quant12345 avatar Oct 16 '23 20:10 quant12345

Hi @quant12345, Thank you for your suggestion! Your input is much appreciated! I noticed that the prime filtering for 072 is already implemented in sol2.py. Although it is as fast as sol1 with both taking 1 sec. What should be done in this case? Also, I've identified a bug in 'sol1.py' that appears to be related to integer overflow. This issue specifically arises on Windows platforms because, by default, numpy on Windows 64-bit uses 32-bit integers, whereas on Linux, it employs 64-bit integers, so it also went unnoticed in automated checking of this repo. See #10672. I already have 3 PRs open so bot closes new ones.

ManpreetXSingh avatar Oct 18 '23 13:10 ManpreetXSingh

@ManpreetSingh2004 now I checked it in Windows, it’s really an incorrect number if you don’t specify the type. I used your filtering(072) by prime numbers. And with this it’s faster, significantly. Below code:

test

import datetime
import numpy as np
from math import isqrt

def solution_2(limit: int = 1_000_000) -> int:
    """
    Return the number of reduced proper fractions with denominator less than limit.
    >>> solution(8)
    21
    >>> solution(1000)
    304191
    """
    primes = set(range(3, limit, 2))
    primes.add(2)
    for p in range(3, limit, 2):
        if p not in primes:
            continue
        primes.difference_update(set(range(p * p, limit, p)))

    phi = [float(n) for n in range(limit + 1)]

    for p in primes:
        for n in range(p, limit + 1, p):
            phi[n] *= 1 - 1 / p

    return int(sum(phi[2:]))


def np_calculate_prime_numbers(max_number: int) -> list[int]:
    """
    Returns prime numbers below max_number.
    See: https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes

    >>> np_calculate_prime_numbers(10)
    [2, 3, 5, 7]
    >>> np_calculate_prime_numbers(2)
    []
    """
    if max_number <= 2:
        return []

    # List containing a bool value for every odd number below max_number/2
    is_prime = np.ones(max_number // 2, dtype=bool)

    for i in range(3, isqrt(max_number - 1) + 1, 2):
        if is_prime[i // 2]:
            # Mark all multiple of i as not prime using list slicing
            is_prime[i ** 2 // 2:: i] = False

    primes = np.where(is_prime)[0] * 2 + 1
    primes[0] = 2
    return primes


def solution_new(limit: int = 1_000_000) -> int:
    """
    Returns an integer, the solution to the problem
    >>> solution(10)
    31
    >>> solution(100)
    3043
    >>> solution(1_000)
    304191
    """

    # generating an array from -1 to limit
    phi = np.arange(-1, limit)
    primes = np_calculate_prime_numbers(limit)

    for i in primes:
        #ind = np.arange(2 * i, limit + 1, i)  # indexes for selection
        phi[2*i::i] -= phi[2*i::i] // i

    return int(np.sum(phi[2: limit + 1], dtype=np.int64))


def solution(limit: int = 1_000_000) -> int:
    """
    Returns an integer, the solution to the problem
    >>> solution(10)
    31
    >>> solution(100)
    3043
    >>> solution(1_000)
    304191
    """

    # generating an array from -1 to limit
    phi = np.arange(-1, limit)

    for i in range(2, limit + 1):
        if phi[i] == i - 1:
            ind = np.arange(2 * i, limit + 1, i)  # indexes for selection
            phi[ind] -= phi[ind] // i

    return int(np.sum(phi[2: limit + 1], dtype=np.int64))


if __name__ == "__main__":

    n = 1_000_000

    now = datetime.datetime.now()
    old = solution(n)
    print('old_time', datetime.datetime.now() - now)

    now = datetime.datetime.now()
    new = solution_new(n)
    print('new_time', datetime.datetime.now() - now)

    now = datetime.datetime.now()
    solution_2 = solution_2(n)
    print('solution_2', datetime.datetime.now() - now)

    print('comparison for identity', old == new == solution_2)

Output:

old_time 0:00:00.773152
new_time 0:00:00.410564
solution_2 0:00:01.439802
comparison for identity True

Here is a slightly faster implementation primes of primesfrom2to than 'your' primesfrom3to.

quant12345 avatar Oct 18 '23 16:10 quant12345

@quant12345 Hey, That's really nice. 😊

ManpreetXSingh avatar Oct 18 '23 16:10 ManpreetXSingh

We only want one solution in the code.

@dhruvmanila If I understand correctly, the three slower solutions (slow_solution, slicing_solution, py_solution) should be removed. Am I correct?

ManpreetXSingh avatar Oct 29 '23 15:10 ManpreetXSingh

Hi @dhruvmanila, I've made the requested changes. Please let me know if any additional adjustments are needed. The get_totients function is modified. The remainder of the solution remains unchanged from its original form. Also I've put the benchmark results in the PR description.

ManpreetXSingh avatar Dec 29 '23 12:12 ManpreetXSingh