awesome-pytest-speedup

Disabling bytecode generation

Open dsegan opened this issue 2 years ago • 2 comments

FWIW, I've seen significant speedups, especially with pytest, on large codebases when bytecode generation is enabled.

The nitty gritty details are that:

  • Pytest does assertion rewriting for the test modules (and conftest.py files) it imports: with bytecode generation enabled, the rewritten bytecode is stored in .pyc files.
  • Bytecode generation is fast in itself, but much slower when pytest needs to rewrite all the assertions.
  • If you have a large number of test files, and especially poorly factored code where a large part of the world gets imported when running tests, not having to regenerate the .pyc files is a significant boost: IIRC, pytest --collect-only went from 45s to 15s for this particular codebase.
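To make the mechanism concrete, here is a minimal sketch of what the environment variable actually changes: a fresh interpreter started with PYTHONDONTWRITEBYTECODE set reports sys.dont_write_bytecode as True. As far as I can tell, that flag is what pytest's assertion rewriter consults before caching rewritten modules, but treat that part as my reading rather than documented behaviour. The helper name dont_write_bytecode_flag is made up for this example.

```python
import os
import subprocess
import sys


def dont_write_bytecode_flag(env_value=None):
    """Return sys.dont_write_bytecode as seen by a freshly started interpreter.

    env_value is the value given to PYTHONDONTWRITEBYTECODE in the child
    process; None means the variable is left unset.
    """
    env = dict(os.environ)
    # Start from a clean slate so the caller's own environment does not leak in.
    env.pop("PYTHONDONTWRITEBYTECODE", None)
    if env_value is not None:
        env["PYTHONDONTWRITEBYTECODE"] = env_value
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.dont_write_bytecode)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    print("unset:", dont_write_bytecode_flag())
    print("set to 1:", dont_write_bytecode_flag("1"))
```

With the flag on, nothing gets cached between runs, so every run pays for compilation and assertion rewriting again.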

So if you are after fast pytest runs, especially for single test runs, I'd revisit the advice on PYTHONDONTWRITEBYTECODE.

dsegan avatar Jun 15 '23 08:06 dsegan

Oh, wow, that sounds incredible! Could you tell me more about the specific use case? Ideally, we'd have a large repo of dummy/generated code that we can run pytest against, with PYTHONDONTWRITEBYTECODE enabled or disabled, and compare results. This repo could then be linked as a source for the claim to enable/disable PYTHONDONTWRITEBYTECODE.

zupo avatar Jun 15 '23 09:06 zupo

It's not that hard to reproduce. I've tried to think of a decently large Python project using pytest, and sqlalchemy came to mind:

$ git clone https://github.com/sqlalchemy/sqlalchemy.git && cd sqlalchemy
$ python3 -m venv venv
$ . venv/bin/activate
$ pip3 install pytest
$ time pytest --collect-only test
...
======================================= 30386 tests collected in 12.13s ========================================

real	0m14,501s
user	0m13,827s
sys	0m0,335s
$ time pytest --collect-only test
...
======================================== 30386 tests collected in 6.21s ========================================

real	0m7,938s
user	0m7,664s
sys	0m0,151s
$ find . -name '*.pyc' -delete  # simulate no .pyc files having been written
$ time pytest --collect-only test
...
======================================= 30386 tests collected in 12.03s ========================================

real	0m14,388s
user	0m14,045s
sys	0m0,252s

Basically, if you had PYTHONDONTWRITEBYTECODE set, you'd always pay that ~14s startup before even a single test runs, whereas with the .pyc files in place you are down to ~8s (a typical change only touches a few files, and only those get rewritten).
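If you want to script the "cold cache" step instead of typing the find command, something like the following should do. It is only a sketch: remove_pyc_files is a name I made up, and it deletes every .pyc under the given root (including ones inside a virtualenv that lives there), just like the shell one-liner.

```python
from pathlib import Path


def remove_pyc_files(root):
    """Delete every .pyc file under root and return how many were removed."""
    # Materialise the list first so we are not deleting while still walking.
    pyc_files = list(Path(root).rglob("*.pyc"))
    for pyc in pyc_files:
        pyc.unlink()
    return len(pyc_files)


if __name__ == "__main__":
    removed = remove_pyc_files(".")
    print(f"removed {removed} .pyc files")
```

Run it from the repository root between two timed pytest --collect-only runs to compare a cold and a warm cache.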

Note that there is a workaround: instead of using pytest pattern matching for test names, you can pass the test's node ID directly, path/to/test_file.py::TestClass::test_function, in which case pytest only needs to collect that one file rather than go through the whole collection step.
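For completeness, a tiny sketch of how such a node ID is put together and turned into a command line. The helper names pytest_node_id and single_test_command are hypothetical, and the path is the placeholder from above, not a real file.

```python
import sys


def pytest_node_id(path, *names):
    """Join a test file path and nested class/function names with '::'."""
    return "::".join([path, *names])


def single_test_command(path, *names):
    """Build an argv list that runs exactly one test through 'python -m pytest'."""
    return [sys.executable, "-m", "pytest", pytest_node_id(path, *names)]


if __name__ == "__main__":
    # Placeholder path and names; substitute a real test from your project.
    print(single_test_command("path/to/test_file.py", "TestClass", "test_function"))
```

The resulting list can be handed to subprocess.run if you want to time single-test startup with and without the .pyc files present.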

dsegan avatar Jun 15 '23 09:06 dsegan