mypy icon indicating copy to clipboard operation
mypy copied to clipboard

Mypy runs really slow on a complicated, generator-heavy function

Open nickdrozd opened this issue 1 year ago • 4 comments

I have a function yield_programs, and it calls a helper function yield_actions. Obviously they are a bit complicated; they were written like this for fun. They live together in the following file:

import re
from itertools import product
from collections.abc import Iterator

SHIFTS = 'R', 'L'
HALT = '_'


def yield_actions(
    states: int,
    colors: int,
    halt: bool = False,
) -> Iterator[str]:
    yield from filter(
        (
            lambda action: action[2] != HALT or action == '1R_'
            if halt
            else lambda action: action
        ),
        (
            ''.join(prod)
            for prod in product(
                tuple(map(str, range(colors))),
                SHIFTS,
                tuple(map(chr, range(65, 65 + states))) + ((HALT,) if halt else ()),
            )
        ),
    )


def yield_programs(
    states: int,
    colors: int,
    halt: bool,
    rejects: list[str] | None = None,
) -> Iterator[str]:
    yield from filter(
        lambda prog: not any(re.compile(regex).match(prog) for regex in rejects or []),
        (
            prog
            for prog in (
                '  '.join(state)
                for state in product(
                    (
                        ' '.join(state)
                        for state in product(
                            yield_actions(states, colors, halt), repeat=colors  # <-- call to yield_actions
                        )
                    ),
                    repeat=states,
                )
            )
            if (prog[:3] == '1RB' and (not halt or prog.count(HALT) == 1))
        ),
    )

Mypy runs instantly and finds no issues.

Now I attempt to inline the call to yield_actions:

def yield_programs(
    states: int,
    colors: int,
    halt: bool,
    rejects: list[str] | None = None,
) -> Iterator[str]:
    yield from filter(
        lambda prog: not any(re.compile(regex).match(prog) for regex in rejects or []),
        (
            prog
            for prog in (
                '  '.join(state)
                for state in product(
                    (
                        ' '.join(state)
                        for state in product(
                            filter(  # <-- inlined call
                                (
                                    lambda action: action[2] != HALT or action == '1R_'
                                    if halt
                                    else lambda action: action
                                ),
                                (
                                    ''.join(prod)
                                    for prod in product(
                                        tuple(map(str, range(colors))),
                                        SHIFTS,
                                        tuple(map(chr, range(65, 65 + states)))
                                        + ((HALT,) if halt else ()),
                                    )
                                ),
                            ),
                            repeat=colors,
                        )
                    ),
                    repeat=states,
                )
            )
            if (prog[:3] == '1RB' and (not halt or prog.count(HALT) == 1))
        ),
    )

Mypy still finds no issues. However, it now takes around 50 seconds to run. Granted this function is bizarrely complicated (please don't ask why I would want to do such a thing), but that still seems like an awfully long time.

My feeling (which could be totally wrong) is that there is something that could be cached somewhere and that would fix this immediately. Or maybe it's because of the generators.

I did some profiling. Here is the callgraph colored by time spent in each function:

color-by-self-time

And the callgraph colored by time spent in each function along with subcalls:

color-by-total-time

mypy 0.982 (compiled: no)
Python 3.11.0

nickdrozd avatar Nov 05 '22 17:11 nickdrozd

Interesting, mypy seems to be able to analyze this code much faster on Python 3.10 than on 3.11. Using mypy 0.982 in both cases, I see the following times on my machine:

Python 3.10: 4.85s Python 3.11: 18.01s

erictraut avatar Nov 05 '22 23:11 erictraut

That's probably because there are mypyc-compiled wheels only for 3.10, not 3.11.

JelleZijlstra avatar Nov 05 '22 23:11 JelleZijlstra

This is a pattern I have seen in other cases (e.g. literal list/set/map that have deeply nested calls for some values).

I believe it happens because mypy in general doesn't cache inferred types for inner nodes and ends-up repeatedly re-running type inference for every extra outer nesting. The deeper you nest, the more pronounced this effect becomes. Reducing the nesting by extracting a value or function improves the situation because then mypy only does type checking once for the extracted subtree of the AST.

I have submitted a partial fix for this in the case of simple literal list/set/dict expressions but more complex nested structures that are not handled by the existing fast path will always exhibit this polynomial performance degradation unless a significant change is done to more aggressively cache the type of inferred subtree...

huguesb avatar Nov 07 '22 20:11 huguesb

Just an update: running this with mypy 1.4.1 (compiled: yes) is still very slow.

nickdrozd avatar Jul 04 '23 17:07 nickdrozd