coveragepy
COVERAGE_CORE=sysmon with branch coverage can be 2x slower than default in some cases
Describe the bug
Using COVERAGE_CORE=sysmon with branch coverage can be 2x slower than the default core. The exact factor likely depends on the code whose coverage is measured: trying different variations of the Python snippet below, I have seen the degradation vary from roughly 20% to 2x.
To Reproduce
Python script:
# test-coverage.py
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import TunedThresholdClassifierCV
X, y = load_breast_cancer(return_X_y=True)
for i in range(10):
    model = TunedThresholdClassifierCV(
        estimator=LogisticRegression()
    ).fit(X, y)
Script to run:
# Just for completeness you need to have Python 3.12 to be able to use sysmon
pip install scikit-learn coverage
git clone https://github.com/scikit-learn/scikit-learn --depth 1
# Note: there are some scikit-learn specific warnings, so the grep is to keep only the timing info
(time coverage run --branch --source sklearn /tmp/test-coverage.py 2>&1) | grep total
# on my machine: 9.75s user 0.09s system 129% cpu 7.616 total
(time COVERAGE_CORE=sysmon coverage run --branch --source sklearn /tmp/test-coverage.py 2>&1) | grep total
# on my machine: 16.88s user 0.10s system 114% cpu 14.798 total
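The same comparison can be scripted in Python, which makes it easier to repeat the measurement. This is only a minimal sketch: the helper names (`coverage_command`, `coverage_env`, `time_run`, `slowdown`) are made up for illustration and are not part of coverage.py, and it assumes coverage is installed for the interpreter running the script.

```python
import os
import subprocess
import sys
import time


def coverage_command(script, source="sklearn", branch=True):
    """Build the same `coverage run` command line as in the shell script."""
    cmd = [sys.executable, "-m", "coverage", "run"]
    if branch:
        cmd.append("--branch")
    cmd += ["--source", source, script]
    return cmd


def coverage_env(core=None):
    """Copy the current environment, setting COVERAGE_CORE only if requested."""
    env = dict(os.environ)
    env.pop("COVERAGE_CORE", None)
    if core is not None:
        env["COVERAGE_CORE"] = core
    return env


def time_run(script, core=None, branch=True):
    """Return wall-clock seconds for one coverage run of `script`."""
    start = time.perf_counter()
    subprocess.run(
        coverage_command(script, branch=branch),
        env=coverage_env(core),
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def slowdown(baseline_seconds, candidate_seconds):
    """A ratio above 1 means the candidate is slower than the baseline."""
    return candidate_seconds / baseline_seconds
```

Calling `slowdown(time_run(path), time_run(path, core="sysmon"))` with the path to the test script gives the ratio directly. With the wall-clock numbers reported above, `slowdown(7.616, 14.798)` is about 1.94.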
I read the other performance issues with coverage and Python 3.12, for example https://github.com/nedbat/coveragepy/issues/1665 and https://github.com/python/cpython/issues/107674, and this seems like a slightly different issue. It may be related to the fact that only statement coverage uses sysmon, and not branch coverage, according to https://github.com/nedbat/coveragepy/issues/1746#issuecomment-1936988445, but I still find the performance degradation somewhat surprising.
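For readers unfamiliar with the mechanism: the sysmon core builds on the `sys.monitoring` API added in Python 3.12. The sketch below is not coverage.py's implementation. It is a minimal illustration, with a hypothetical helper named `record_lines`, of why line events can be cheap: the callback returns `sys.monitoring.DISABLE`, so each line location reports once and then stops firing. Line events on their own say which lines ran, not which line-to-line transitions were taken, and branch coverage needs the latter.

```python
import sys


def record_lines(func):
    """Run func() and return the line numbers executed in its own code object.

    Requires Python 3.12+. Raises ValueError if another tool (for example
    coverage.py itself running with the sysmon core) already holds the
    coverage tool id.
    """
    mon = sys.monitoring
    tool = mon.COVERAGE_ID
    target = func.__code__
    seen = set()

    def on_line(code, line_number):
        if code is target:
            seen.add(line_number)
        # One report per location is enough for statement coverage, so
        # switch this location off for the rest of the run.
        return mon.DISABLE

    mon.use_tool_id(tool, "line-sketch")
    try:
        mon.register_callback(tool, mon.events.LINE, on_line)
        mon.set_events(tool, mon.events.LINE)
        # Re-enable any locations disabled by an earlier call.
        mon.restart_events()
        func()
    finally:
        mon.set_events(tool, mon.events.NO_EVENTS)
        mon.register_callback(tool, mon.events.LINE, None)
        mon.free_tool_id(tool)
    return seen
```

A loop body that runs thousands of times triggers the callback once per line here, which is where the speedup for statement coverage comes from.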
How can we reproduce the problem? Please be specific. Don't link to a failing CI job. Answer the questions below:
- What version of Python are you using? 3.12.4
- What version of coverage.py shows the problem? The output of coverage debug sys is helpful.
-- sys -------------------------------------------------------
coverage_version: 7.5.4
coverage_module: /home/lesteve/micromamba/envs/test-coverage/lib/python3.12/site-packages/coverage/__init__.py
core: -none-
CTracer: available
plugins.file_tracers: -none-
plugins.configurers: -none-
plugins.context_switchers: -none-
configs_attempted: /home/lesteve/dev/cpython/.coveragerc
/home/lesteve/dev/cpython/setup.cfg
/home/lesteve/dev/cpython/tox.ini
/home/lesteve/dev/cpython/pyproject.toml
configs_read: -none-
config_file: None
config_contents: -none-
data_file: -none-
python: 3.12.4 | packaged by conda-forge | (main, Jun 17 2024, 10:23:07) [GCC 12.3.0]
platform: Linux-6.9.8-arch1-1-x86_64-with-glibc2.39
implementation: CPython
gil_enabled: True
executable: /home/lesteve/micromamba/envs/test-coverage/bin/python3.12
def_encoding: utf-8
fs_encoding: utf-8
pid: 510639
cwd: /home/lesteve/dev/cpython
path: /home/lesteve/micromamba/envs/test-coverage/bin
/home/lesteve/micromamba/envs/test-coverage/lib/python312.zip
/home/lesteve/micromamba/envs/test-coverage/lib/python3.12
/home/lesteve/micromamba/envs/test-coverage/lib/python3.12/lib-dynload
/home/lesteve/micromamba/envs/test-coverage/lib/python3.12/site-packages
environment: CONDA_PYTHON_EXE = /home/lesteve/micromamba/bin/python
HOME = /home/lesteve
PYTHONPATH =
command_line: /home/lesteve/micromamba/envs/test-coverage/bin/coverage debug sys
sqlite3_sqlite_version: 3.46.0
sqlite3_temp_store: 0
sqlite3_compile_options: ATOMIC_INTRINSICS=1, COMPILER=gcc-12.3.0, DEFAULT_AUTOVACUUM,
DEFAULT_CACHE_SIZE=-2000, DEFAULT_FILE_FORMAT=4,
DEFAULT_JOURNAL_SIZE_LIMIT=-1, DEFAULT_MMAP_SIZE=0, DEFAULT_PAGE_SIZE=4096,
DEFAULT_PCACHE_INITSZ=20, DEFAULT_RECURSIVE_TRIGGERS,
DEFAULT_SECTOR_SIZE=4096, DEFAULT_SYNCHRONOUS=2,
DEFAULT_WAL_AUTOCHECKPOINT=1000, DEFAULT_WAL_SYNCHRONOUS=2,
DEFAULT_WORKER_THREADS=0, DIRECT_OVERFLOW_READ, ENABLE_COLUMN_METADATA,
ENABLE_DBSTAT_VTAB, ENABLE_FTS3, ENABLE_FTS3_TOKENIZER, ENABLE_FTS4,
ENABLE_FTS5, ENABLE_GEOPOLY, ENABLE_MATH_FUNCTIONS, ENABLE_RTREE,
ENABLE_UNLOCK_NOTIFY, MALLOC_SOFT_LIMIT=1024, MAX_ATTACHED=10,
MAX_COLUMN=2000, MAX_COMPOUND_SELECT=500, MAX_DEFAULT_PAGE_SIZE=8192,
MAX_EXPR_DEPTH=10000, MAX_FUNCTION_ARG=127, MAX_LENGTH=1000000000,
MAX_LIKE_PATTERN_LENGTH=50000, MAX_MMAP_SIZE=0x7fff0000,
MAX_PAGE_COUNT=0xfffffffe, MAX_PAGE_SIZE=65536, MAX_SQL_LENGTH=1000000000,
MAX_TRIGGER_DEPTH=1000, MAX_VARIABLE_NUMBER=250000, MAX_VDBE_OP=250000000,
MAX_WORKER_THREADS=8, MUTEX_PTHREADS, SECURE_DELETE, SYSTEM_MALLOC,
TEMP_STORE=1, THREADSAFE=1
- What versions of what packages do you have installed? The output of pip freeze is helpful.
coverage==7.5.4
joblib==1.4.2
numpy==2.0.0
scikit-learn==1.5.1
scipy==1.14.0
setuptools==70.2.0
threadpoolctl==3.5.0
wheel==0.43.0
Expected behavior
Using COVERAGE_CORE=sysmon improves performance or, in the worst case, does not improve it much, but at least does not degrade it.
Additional context
We originally saw this in the scikit-learn CI https://github.com/scikit-learn/scikit-learn/pull/29444#issuecomment-2219550662.
Yes, the sysmon support is not yet in place to do a good job with branch coverage. I could add a warning to that effect if it would help.
Thanks for your answer, I guess this was more of a "is this expected" question than a call for action.
I am not sure whether a warning would be that useful. In our use case we run coverage via pytest-cov, and I think pytest would capture the warning, which would more or less hide it among the many other warnings emitted when we run our tests.
Another option: refuse to run with COVERAGE_CORE=sysmon and branch=True.
Why not error indeed. I guess it would make sense to do this if you think that in a vast majority of cases the performance will be worse using COVERAGE_CORE=sysmon with branch=True compared to branch=True with COVERAGE_CORE unset.
In my case, we did see a (moderate) improvement in performance when setting COVERAGE_CORE=sysmon while still using branch coverage. So I at least like that combining them does not result in an error (the performance improvement was much bigger if I also disabled branch coverage, but for now we prefer retaining that even with some perf cost).
> Another option: refuse to run with COVERAGE_CORE=sysmon and branch=True.
Please don't :)
We (https://github.com/scrapy/scrapy/) see almost a 2x speedup with COVERAGE_CORE=sysmon. Or rather, we see an almost 2x slowdown on 3.12 compared to e.g. 3.9, which COVERAGE_CORE=sysmon completely mitigates, and we use branch coverage. (We spawn many subprocesses that take up to 2-3s to initialize, because under coverage on 3.12 importing things like hpack.huffman_table takes an unreasonably long time.) I just compared the run time on 3.12 with COVERAGE_CORE=sysmon, with branch coverage enabled and disabled, and it looks like branch coverage adds just 15% to the total run time.
To try to be explicit (as the original reporter), I am completely fine with the status quo of not erroring.
I was surprised originally because I had naive assumptions about COVERAGE_CORE=sysmon as a magical tweak that would always speed things up. If someone ends up in the same situation, then this issue is here to clarify matters.
In the end, in our case (https://github.com/scikit-learn/scikit-learn), we accepted the trade-off of a faster CI at the cost of branch coverage: we now use COVERAGE_CORE=sysmon with branch=false, whereas previously we used branch=true and did not set COVERAGE_CORE. Our slowest CI job (using Python 3.12) went down from ~50 minutes to ~20-30 minutes; see https://github.com/scikit-learn/scikit-learn/pull/29473#issuecomment-2500006337 if you are curious about the details.
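As a rough illustration of that kind of setup, here is a minimal sketch of choosing the settings per CI job. The function `ci_coverage_settings` is hypothetical and is not taken from the scikit-learn CI or from coverage.py. It only encodes two points from this thread: sysmon needs Python 3.12, and branch measurement is switched off by default as the trade-off.

```python
import sys


def ci_coverage_settings(version_info=sys.version_info, branch=False):
    """Hypothetical helper: env overrides and CLI flags for one CI job."""
    env = {}
    # sysmon relies on sys.monitoring, which needs Python 3.12 or later.
    if version_info >= (3, 12):
        env["COVERAGE_CORE"] = "sysmon"
    # Branch measurement is opt-in here; whether it helps or hurts under
    # sysmon depended on the workload in the reports above.
    flags = ["--branch"] if branch else []
    return env, flags
```

The returned dictionary would be merged into the job's environment and the flags appended to `coverage run`. Whether to pass `branch=True` is best decided by timing both variants on the actual test suite.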
BTW, now there's a new implementation of branch measurement with COVERAGE_CORE=sysmon: https://nedbatchelder.com/blog/202503/faster_branch_coverage_measurement.html
Early reports are that it is still slower than traditional branch measurement, which is disappointing. I'm wondering how it will perform for you...
I think this is now faster than traditional coverage. I will close this issue, let me know if I was wrong.