mypyc icon indicating copy to clipboard operation
mypyc copied to clipboard

Unexpectedly poor performance on simple test

Open davystrong opened this issue 2 years ago • 3 comments

I came across mypyc (always wanted something like this, and for some reason never found it!) and decided to try it out on a simple problem. The problem is a simple implementation of the sieve of Eratosthenes, calculating all primes up to 10000000:

# I also tried using i32, with no improvement
def get_primes(limit: int) -> list[int]:
    primes: list[int] = []
    not_primes: set[int] = set()
    for i in range(2, limit):
        if i not in not_primes:
            primes.append(i)
            not_primes.update(range(2*i, limit, i))
    return primes

# From a different file:
import timeit, primes
print(timeit.timeit(lambda: primes.get_primes(10000000), number=10))

I also have a Rust implementation of the same algorithm compiled as a Python package.

Running this ten times with timeit in pure Python mode takes about 46 seconds. Running the Rust version takes about 24 seconds. I expected the mypyc version to be a significant improvement over Python but probably not quite as good as a manually implemented Rust version. However, it's hardly any better than pure Python, taking about 42 seconds to complete. Am I doing something wrong? It's almost a 10% improvement, which definitely helps but given how simple the problem is, I was hoping for much better results (30 seconds, maybe less).

For completeness, this is my Rust implementation, in case there is some neat optimisation in Rust that is inherently impossible using mypyc that I haven't thought of:

use pyo3::prelude::*;
use std::collections::HashSet;

#[pyfunction]
fn get_primes(limit: u32) -> Vec<u32> {
    let mut not_primes = HashSet::new();
    let mut primes = Vec::new();
    for i in 2..limit {
        if !not_primes.contains(&i) {
            primes.push(i);
            for j in (2 * i..limit).step_by(i as usize) {
                not_primes.insert(j);
            }
        }
    }
    return primes;
}

I'm using mypy version 1.3.0 and Python 3.10.6 in an Ubuntu 22.04 Docker container.

davystrong avatar Jun 13 '23 10:06 davystrong

I didn't dig much into details, but I suspect a lot of time is spent in PyList_Append(). @JukkaL is now working on "compact lists" that would be much faster. IIUC they will initially only work with i32 etc., but long-term they should work with arbitrary structs (which mypy uses for compiled classes under the hood). @JukkaL can give much more details on this (and an estimated timeline).

ilevkivskyi avatar Jun 14 '23 11:06 ilevkivskyi

I don't think the bottleneck is in list.append. I think that the benchmark is spending most of the time in the Python set operations. Mypyc uses a regular Python set object behind the scenes, and it can't make it (significantly) faster. It looks like it would helpful to use a more efficient data structure. The Python set is pretty inefficient since it's implemented as a hash table and supports arbitrary key/value types. Once you avoid the use of sets, the packed arrays (compact lists) will help once they are supported by mypyc (this will likely not happen until Aug/Sep at the earliest).

JukkaL avatar Jun 15 '23 10:06 JukkaL

Ok, I had assumed that PyList_Append was being used in other places that I wasn't aware of. I ran a few more tests which seem to indicate that you're correct, it is linked to the set operations. Specifically, iterating through the range without adding all items to the set is sped up by 40% using mypyc, while adding all the items to the set (ensuring it doesn't change anything else) only results in a 13% reduction, if the items mostly don't repeat.

Thanks for the timeline!

Edit: Switching from using a set to using a list results in an improvement of almost 50% between Python and mypyc (27 seconds down to 14 seconds). However, applying the same change to my Rust implementation reduces its running time to under 1 second, so it'll be interesting to see the effect of packed arrays when they are supported.

davystrong avatar Jun 15 '23 12:06 davystrong