pytest-pyodide icon indicating copy to clipboard operation
pytest-pyodide copied to clipboard

pytest_pyodide.hook make collection very slow even if @run_in_pyodide is never used

Open antocuni opened this issue 2 months ago • 2 comments

TL;DR: if you have a moderately-sized test suite, hook.py:pytest_pycollect_makemodule make collection noticeably slower.

I bumbed into issue when working on SPy. SPy has 65 test files and ~5k total test lines:

❯ cloc --vcs=git spy/tests
      65 text files.
      65 unique files.                              
       3 files ignored.

github.com/AlDanial/cloc v 1.98  T=0.13 s (509.0 files/s, 95704.2 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          65           1813           4653           5756

I have pytest-pyodide installed in my venv, and collection takes ~3s:

❯ pytest -q --collect-only
[...]
1203 tests collected in 3.2s

If I manually comment out this function, I shave more that 1 second: https://github.com/pyodide/pytest-pyodide/blob/f97d345d4410c594d8f3748c909d3460a531a9ba/pytest_pyodide/hook.py#L243-L251

❯ pytest -q --collect-only
[...]
1203 tests collected in 1.29s

I don't fully understand what pytest_collect_makemodule does, but it seems related to @run_in_pyodide. Note that my tests use this functionality only in one test file:

❯ rg 'run_in_pyodide' spy/tests
spy/tests/test_llwasm.py
2:from pytest_pyodide import run_in_pyodide  # type: ignore
49:            self.run_in_pyodide_maybe = run_in_pyodide
53:            self.run_in_pyodide_maybe = lambda fn: fn
63:        @self.run_in_pyodide_maybe
85:        @self.run_in_pyodide_maybe
105:        @self.run_in_pyodide_maybe
124:        @self.run_in_pyodide_maybe
148:        @self.run_in_pyodide_maybe
173:        @self.run_in_pyodide_maybe
207:        @self.run_in_pyodide_maybe

I tried to do this hack and it seems to help a lot, although again I'm not really sure to understand what it does:

def pytest_pycollect_makemodule(module_path: Path, parent: Collector) -> None:
    source = module_path.read_bytes()
    if b"run_in_pyodide" in source:
        strfn = str(module_path)
        tree = ast.parse(source, filename=strfn)
        ORIGINAL_MODULE_ASTS[strfn] = tree
        tree2 = deepcopy(tree)
        rewrite_asserts(tree2, source, strfn, REWRITE_CONFIG)
        REWRITTEN_MODULE_ASTS[strfn] = tree2
    orig_pytest_pycollect_makemodule(module_path, parent)
❯ pytest -q --collect-only
[...]
1203 tests collected in 1.34s

antocuni avatar Oct 20 '25 11:10 antocuni

Thanks for the report Antonio. As far as I remember, the rewrite_asserts there is used to rewrite the assertion statement which is useful to show the local file name and the line number when the test fails in side the @run_in_pyodide.

It is correct that we don't need to do it when the test function is not using @run_in_pyodide. I think your hack is okay for a quick workaround (I can accept that as a fix if you need), while it would be nice to do it in a smarter way.

ryanking13 avatar Oct 21 '25 13:10 ryanking13

The proposed fix makes sense to me. Someone could do:

import pytest_pyodide
dec = getattr(pytest_pyodide, "_".join(["run","in","pyodide"]))

but that is unreasonable and we're already doing macro magic here.

hoodmane avatar Oct 21 '25 14:10 hoodmane