codon icon indicating copy to clipboard operation
codon copied to clipboard

Simple python program executes more slowly with codon than with python 3.11

Open peterlietz opened this issue 1 year ago • 7 comments

Hi,

I tested codon on a simple, self-contained python program (code below), which I thought might be type of algorithm to benefit from compilation. To my surprise, the runtime increased. Since it was suggested in the FAQ to open an issue in such instances, here we go.

Best regards, Peter

$ time python murphy.py 
5.797

real	1m9.516s
user	1m9.389s
sys	0m0.063s 

$ codon build -release -exe murphy.py
ld: warning: dylib (/Users/peter/.codon/lib/codon/libcodonrt.dylib) was built for newer macOS version (11.7) than being linked (11.3)
ld: warning: object file (murphy.o) was built for newer macOS version (13.0) than being linked (11.3)
ld: warning: dylib (/Users/peter/.codon/lib/codon/libomp.dylib) was built for newer macOS version (11.7) than being linked (11.3)

$ time ./murphy
5.797

real	1m25.066s
user	1m30.261s
sys	1m40.128s
import random
from collections import defaultdict
from functools import partial
from itertools import combinations

NUM_CARDS = 12
NUM_CHOSEN = 4
ROW_LENGTH = 3


def zip_fun(*fs):
    def result(x):
        return tuple(f(x) for f in fs)

    return result


def combs(k=NUM_CHOSEN):
    for s in combinations(range(NUM_CARDS), k):
        # yield frozenset(s)
        yield set(s)


def display(s):
    result = ["X" if i in s else "-" for i in range(NUM_CARDS)]
    result = [
        "".join(result[(ROW_LENGTH * i) : (ROW_LENGTH * i + ROW_LENGTH)])
        for i in range(NUM_CARDS // ROW_LENGTH)
    ]
    result = "\n".join(result)
    return result


def minimal(f, xs):
    min_value = None
    result = None
    for x in xs:
        f_x = f(x)
        if min_value is None or f_x < min_value:
            min_value = f_x
            result = x
    return result


def quality1(x):
    a = max(i % ROW_LENGTH for i in x) - min(i % ROW_LENGTH for i in x)
    b = max(i // ROW_LENGTH for i in x) - min(i // ROW_LENGTH for i in x)
    return a * b


def quality2(x):
    return len(set(i % ROW_LENGTH for i in x)) + len(set(i // ROW_LENGTH for i in x))


def score_prob(scenarios, subset):
    """
    lower is better
    """
    d = defaultdict(int)
    for s in scenarios:
        d[len(s & subset)] += 1
    return sum(v**2 for v in d.values())


def score_worst(scenarios, subset):
    """
    lower is better
    """
    d = defaultdict(int)
    for s in scenarios:
        d[len(s & subset)] += 1
    return max(d.values())


def suggest_question(scenarios, n):
    # prob:         5.797  5.8025  5.812
    # prob + worst: 5.838  5.842
    # worst + prob: 5.891
    # worst:        6.189
    return minimal(
        zip_fun(
            partial(score_prob, scenarios),  # probabilistic
            # partial(score_worst, scenarios),  # worst case
            quality1,
            quality2,
        ),
        combs(n),
    )


def filter_scenarios(scenarios, choice, a):
    for s in scenarios:
        if len(s & choice) == a:
            yield s


def play(dice_fun, answer_fun):
    scenarios = list(combs())
    while len(scenarios) > 1:
        n = dice_fun()
        choice = suggest_question(scenarios, n)
        a = answer_fun(choice)
        scenarios = list(filter_scenarios(scenarios, choice, a))
    assert len(scenarios) == 1
    return scenarios[0]


# def user_dice():
#     return int(input("\nWürfel: "))


# def user_answer(choice):
#     return int(input(f"{display(choice)}: "))


# def main():
#     s = play(user_dice, user_answer)
#     print("\nLösung:")
#     print(display(s))


def test_murphy():
    solutions = list(combs())
    NUM_EXPERIMENTS = 1_000
    MURPHY_DICE = (2, 3, 3, 4, 4, 5)

    total = 0
    random.seed(0)
    for _ in range(NUM_EXPERIMENTS):
        turns = 0
        solution = random.choice(solutions)

        def dice():
            nonlocal turns
            turns += 1
            return random.choice(MURPHY_DICE)

        def answer(choice):
            nonlocal solution
            return len(solution & choice)

        s = play(dice, answer)
        assert solution == s
        total += turns
    print(total / NUM_EXPERIMENTS)


if __name__ == "__main__":
    # main()
    test_murphy()

peterlietz avatar Sep 10 '23 19:09 peterlietz

PS: codon is in the lead slightly on fedora 38. However, I can produce a variant of the code where codon performance is worse than python's on fedora as well.

$ codon build -release -exe murphy.py 
$ time ./murphy
5.797

real	2m23.875s
user	2m21.629s
sys	0m0.157s
$ python
Python 3.11.5 (main, Aug 28 2023, 00:00:00) [GCC 13.2.1 20230728 (Red Hat 13.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 
$ time python murphy.py 
5.797

real	2m29.715s
user	2m27.071s
sys	0m0.054s
$ cat /etc/redhat-release 
Fedora release 38 (Thirty Eight)

peterlietz avatar Sep 11 '23 10:09 peterlietz

I tried memorizing suggest_question. This was done manually, as the lru_cache decorator does not stringify the arguments; e.g. TypeError: unhashable type: 'list'.

SUGGEST_CACHE = dict()
def suggest_question(scenarios, n):
    # prob:         5.797  5.8025  5.812
    # prob + worst: 5.838  5.842
    # worst + prob: 5.891
    # worst:        6.189
    try:
        return SUGGEST_CACHE[f"{scenarios} {n}"]
    except KeyError:
        SUGGEST_CACHE[f"{scenarios} {n}"] = minimal(
            zip_fun(
                partial(score_prob, scenarios),  # probabilistic
                # partial(score_worst, scenarios),  # worst case
                quality1,
                quality2,
            ),
            combs(n),
        )
        return SUGGEST_CACHE[f"{scenarios} {n}"]

The change enables the demonstration to complete in 1/6th the time.

Edit: Renamed SUGGEST_DICT to SUGGEST_CACHE.

marioroy avatar Sep 12 '23 07:09 marioroy

Thank you very much, @marioroy, that is an excellent idea. After memoization, the codon version is 31% faster than python.

peterlietz avatar Sep 12 '23 07:09 peterlietz

@marioroy , I'm generating Fibonacci series with python and the performance results are similar.

Python native is 16X faster than codon.

~/r/p/c/try-codon ❯❯❯ time python fib.py                                                                                                                                        
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987

________________________________________________________
Executed in   46.95 millis    fish           external
   usr time   12.97 millis    0.07 millis   12.90 millis
   sys time   12.87 millis    3.67 millis    9.20 millis

~/r/p/c/try-codon ❯❯❯ time codon run fib.py                                                                                                                                     
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987

________________________________________________________
Executed in  756.86 millis    fish           external
   usr time  638.98 millis    0.08 millis  638.90 millis
   sys time   56.15 millis    3.37 millis   52.79 millis

Code :

def fib(n):
    a, b = 0, 1
    while a < n:
        print(a, end=' ')
        a, b = b, a+b
    print()

fib(1000)

iamshreeram avatar Oct 08 '23 11:10 iamshreeram

@iamshreeram Compile and run with optimizations with the -release option: time codon run -release fib.py

elisbyberi avatar Oct 08 '23 14:10 elisbyberi

@iamshreeram Note that time codon run -release fib.py measures the time to compile and run the program. You should build it first and then time the execution.

Build: codon build -release fib.py Time: time ./fib

elisbyberi avatar Oct 09 '23 21:10 elisbyberi

@elisbyberi , Thanks. That works!

iamshreeram avatar Oct 10 '23 13:10 iamshreeram