coveragepy COVERAGE_CORE=sysmon with branch coverage can be 2x slower than default in some cases

Describe the bug

Using COVERAGE_CORE=sysmon with branch coverage can be 2x slower than the default core. The exact factor likely depends on the code being measured: trying different variations of the Python snippet below, I have seen the degradation vary from roughly 20% to 2x.

To Reproduce

Python script:

# test-coverage.py
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import TunedThresholdClassifierCV

X, y = load_breast_cancer(return_X_y=True)
for i in range(10):
    model = TunedThresholdClassifierCV(
        estimator=LogisticRegression()
    ).fit(X, y)

Script to run:

# For completeness: you need Python 3.12 or later to be able to use sysmon
pip install scikit-learn coverage
git clone https://github.com/scikit-learn/scikit-learn --depth 1

# Note: there are some scikit-learn specific warnings, so the grep is to keep only the timing info
(time coverage run --branch --source sklearn /tmp/test-coverage.py 2>&1) | grep total
# on my machine: 9.75s user 0.09s system 129% cpu 7.616 total

(time COVERAGE_CORE=sysmon coverage run --branch --source sklearn /tmp/test-coverage.py 2>&1) | grep total
# on my machine: 16.88s user 0.10s system 114% cpu 14.798 total
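
The two timed commands above can also be driven from a small Python harness, which makes it easier to repeat the comparison on other scripts. This is only a sketch under assumptions: the helper names (`build_command`, `timed_run`, `slowdown`) are made up for illustration, and it measures wall-clock time only, not user/system time as `time` does. With the wall-clock numbers reported above (7.616 s and 14.798 s) the ratio comes out at about 1.94.

```python
import os
import subprocess
import sys
import time


def build_command(script, source="sklearn", branch=True):
    """Build the `coverage run` command line used in the report (hypothetical helper)."""
    cmd = [sys.executable, "-m", "coverage", "run"]
    if branch:
        cmd.append("--branch")
    cmd += ["--source", source, script]
    return cmd


def timed_run(cmd, core=None):
    """Run cmd once and return the elapsed wall-clock seconds.

    core=None leaves COVERAGE_CORE unset (default core);
    core="sysmon" sets COVERAGE_CORE=sysmon for the child process.
    """
    env = dict(os.environ)
    env.pop("COVERAGE_CORE", None)
    if core is not None:
        env["COVERAGE_CORE"] = core
    start = time.perf_counter()
    # Output is discarded, like the `grep total` filter in the shell version.
    subprocess.run(cmd, env=env, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def slowdown(baseline_seconds, candidate_seconds):
    """Ratio > 1 means the candidate run is slower than the baseline."""
    return candidate_seconds / baseline_seconds


if __name__ == "__main__":
    script = sys.argv[1] if len(sys.argv) > 1 else "/tmp/test-coverage.py"
    cmd = build_command(script)
    default_seconds = timed_run(cmd)
    sysmon_seconds = timed_run(cmd, core="sysmon")
    print(f"default: {default_seconds:.2f}s  sysmon: {sysmon_seconds:.2f}s  "
          f"ratio: {slowdown(default_seconds, sysmon_seconds):.2f}x")
```

A single run per configuration is noisy; repeating each run a few times and comparing the minimum would give a steadier ratio.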

I read the other performance issues about coverage and Python 3.12, for example https://github.com/nedbat/coveragepy/issues/1665 and https://github.com/python/cpython/issues/107674, and this seems like a slightly different issue. It may be related to the fact that only statement coverage uses sysmon, not branch coverage, according to https://github.com/nedbat/coveragepy/issues/1746#issuecomment-1936988445, but I still find the performance degradation somewhat surprising.

How can we reproduce the problem? Please be specific. Don't link to a failing CI job. Answer the questions below:

What version of Python are you using? 3.12.4
What version of coverage.py shows the problem? The output of coverage debug sys is helpful.

-- sys -------------------------------------------------------
               coverage_version: 7.5.4
                coverage_module: /home/lesteve/micromamba/envs/test-coverage/lib/python3.12/site-packages/coverage/__init__.py
                           core: -none-
                        CTracer: available
           plugins.file_tracers: -none-
            plugins.configurers: -none-
      plugins.context_switchers: -none-
              configs_attempted: /home/lesteve/dev/cpython/.coveragerc
                                 /home/lesteve/dev/cpython/setup.cfg
                                 /home/lesteve/dev/cpython/tox.ini
                                 /home/lesteve/dev/cpython/pyproject.toml
                   configs_read: -none-
                    config_file: None
                config_contents: -none-
                      data_file: -none-
                         python: 3.12.4 | packaged by conda-forge | (main, Jun 17 2024, 10:23:07) [GCC 12.3.0]
                       platform: Linux-6.9.8-arch1-1-x86_64-with-glibc2.39
                 implementation: CPython
                    gil_enabled: True
                     executable: /home/lesteve/micromamba/envs/test-coverage/bin/python3.12
                   def_encoding: utf-8
                    fs_encoding: utf-8
                            pid: 510639
                            cwd: /home/lesteve/dev/cpython
                           path: /home/lesteve/micromamba/envs/test-coverage/bin
                                 /home/lesteve/micromamba/envs/test-coverage/lib/python312.zip
                                 /home/lesteve/micromamba/envs/test-coverage/lib/python3.12
                                 /home/lesteve/micromamba/envs/test-coverage/lib/python3.12/lib-dynload
                                 /home/lesteve/micromamba/envs/test-coverage/lib/python3.12/site-packages
                    environment: CONDA_PYTHON_EXE = /home/lesteve/micromamba/bin/python
                                 HOME = /home/lesteve
                                 PYTHONPATH = 
                   command_line: /home/lesteve/micromamba/envs/test-coverage/bin/coverage debug sys
         sqlite3_sqlite_version: 3.46.0
             sqlite3_temp_store: 0
        sqlite3_compile_options: ATOMIC_INTRINSICS=1, COMPILER=gcc-12.3.0, DEFAULT_AUTOVACUUM,
                                 DEFAULT_CACHE_SIZE=-2000, DEFAULT_FILE_FORMAT=4,
                                 DEFAULT_JOURNAL_SIZE_LIMIT=-1, DEFAULT_MMAP_SIZE=0, DEFAULT_PAGE_SIZE=4096,
                                 DEFAULT_PCACHE_INITSZ=20, DEFAULT_RECURSIVE_TRIGGERS,
                                 DEFAULT_SECTOR_SIZE=4096, DEFAULT_SYNCHRONOUS=2,
                                 DEFAULT_WAL_AUTOCHECKPOINT=1000, DEFAULT_WAL_SYNCHRONOUS=2,
                                 DEFAULT_WORKER_THREADS=0, DIRECT_OVERFLOW_READ, ENABLE_COLUMN_METADATA,
                                 ENABLE_DBSTAT_VTAB, ENABLE_FTS3, ENABLE_FTS3_TOKENIZER, ENABLE_FTS4,
                                 ENABLE_FTS5, ENABLE_GEOPOLY, ENABLE_MATH_FUNCTIONS, ENABLE_RTREE,
                                 ENABLE_UNLOCK_NOTIFY, MALLOC_SOFT_LIMIT=1024, MAX_ATTACHED=10,
                                 MAX_COLUMN=2000, MAX_COMPOUND_SELECT=500, MAX_DEFAULT_PAGE_SIZE=8192,
                                 MAX_EXPR_DEPTH=10000, MAX_FUNCTION_ARG=127, MAX_LENGTH=1000000000,
                                 MAX_LIKE_PATTERN_LENGTH=50000, MAX_MMAP_SIZE=0x7fff0000,
                                 MAX_PAGE_COUNT=0xfffffffe, MAX_PAGE_SIZE=65536, MAX_SQL_LENGTH=1000000000,
                                 MAX_TRIGGER_DEPTH=1000, MAX_VARIABLE_NUMBER=250000, MAX_VDBE_OP=250000000,
                                 MAX_WORKER_THREADS=8, MUTEX_PTHREADS, SECURE_DELETE, SYSTEM_MALLOC,
                                 TEMP_STORE=1, THREADSAFE=1

What versions of what packages do you have installed? The output of pip freeze is helpful.

coverage==7.5.4
joblib==1.4.2
numpy==2.0.0
scikit-learn==1.5.1
scipy==1.14.0
setuptools==70.2.0
threadpoolctl==3.5.0
wheel==0.43.0

Expected behavior

Using COVERAGE_CORE=sysmon improves performance; in the worst case it may not improve it much, but at least it does not degrade it.

Additional context

We originally saw this in the scikit-learn CI https://github.com/scikit-learn/scikit-learn/pull/29444#issuecomment-2219550662.

Jul 10 '24 09:07 lesteve

Yes, the sysmon support is not yet in place to do a good job with branch coverage. I could add a warning to that effect if it would help.

Jul 11 '24 14:07 nedbat

Thanks for your answer, I guess this was more of a "is this expected" question than a call for action.

I am not sure a warning would be that useful. In our use case we run coverage via pytest-cov, and I think pytest would capture the warning, which would more or less hide it among the many other warnings emitted when we run our tests.

Jul 19 '24 08:07 lesteve

Another option: refuse to run with COVERAGE_CORE=sysmon and branch=True.

Jul 19 '24 11:07 nedbat

Why not error indeed. I guess it would make sense to do this if you think that in a vast majority of cases the performance will be worse using COVERAGE_CORE=sysmon with branch=True compared to branch=True with COVERAGE_CORE unset.

Jul 19 '24 14:07 lesteve

In my case, we did see a (moderate) improvement in performance when setting COVERAGE_CORE=sysmon while still using branch coverage. So I at least like that combining them does not result in an error (the performance improvement was much bigger if I also disabled branch coverage, but for now we prefer retaining that even with some perf cost).

Aug 05 '24 22:08 stianjensen

"Another option: refuse to run with COVERAGE_CORE=sysmon and branch=True."

Please don't :)

We (https://github.com/scrapy/scrapy/) see almost a 2x speedup with COVERAGE_CORE=sysmon. Or rather, we see an almost 2x slowdown on 3.12 compared to e.g. 3.9, which is completely mitigated by COVERAGE_CORE=sysmon, and we use branch coverage. (We spawn many subprocesses that take up to 2-3s to initialize, because under coverage on 3.12 importing things like hpack.huffman_table takes an unreasonably long time.) I just compared the runtime with branch coverage enabled and disabled on 3.12 with COVERAGE_CORE=sysmon, and it looks like branch coverage adds just 15% to the total run time.

Jan 08 '25 07:01 wRAR

To try to be explicit (as the original reporter), I am completely fine with the status quo of not erroring.

I was surprised originally because I had naive assumptions about COVERAGE_CORE=sysmon as a magical tweak that would always speed things up. If someone ends up in the same situation, then this issue is here to clarify matters.

In the end, in our case (https://github.com/scikit-learn/scikit-learn), we accepted the trade-off of a faster CI without branch coverage: we now use COVERAGE_CORE=sysmon with branch=false, whereas previously we used branch=true and did not set COVERAGE_CORE. Our slowest CI build (using Python 3.12) went down from ~50 minutes to ~20-30 minutes; see https://github.com/scikit-learn/scikit-learn/pull/29473#issuecomment-2500006337 if you are curious about the details.

Jan 08 '25 08:01 lesteve
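
For readers who want to try the same trade-off (COVERAGE_CORE=sysmon with branch coverage turned off), here is a minimal sketch of how the two settings could be prepared from Python. It is not scikit-learn's actual CI configuration: the helper names `render_coveragerc` and `sysmon_env` are hypothetical, and `sklearn` as the measured source is taken from the reproduction script only. The `[run]` section with `source` and `branch` is standard coverage.py configuration; the core is selected through the environment variable.

```python
import configparser
import io
import os


def render_coveragerc(source, branch):
    """Return the text of a minimal .coveragerc (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser["run"] = {
        "source": source,
        "branch": "true" if branch else "false",
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def sysmon_env(base=None):
    """Return a copy of the environment with COVERAGE_CORE=sysmon set."""
    env = dict(os.environ if base is None else base)
    env["COVERAGE_CORE"] = "sysmon"
    return env


if __name__ == "__main__":
    # Statement coverage only, measured with the sysmon core.
    print(render_coveragerc("sklearn", branch=False))
    print("COVERAGE_CORE =", sysmon_env()["COVERAGE_CORE"])
```

The returned environment can be passed as `env=` to `subprocess.run` when launching the test command, so the current shell environment is left untouched.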

BTW, now there's a new implementation of branch measurement with COVERAGE_CORE=sysmon: https://nedbatchelder.com/blog/202503/faster_branch_coverage_measurement.html

Early reports are that it is still slower than traditional branch measurement, which is disappointing. I'm wondering how it will perform for you...

Mar 10 '25 11:03 nedbat

I think this is now faster than traditional coverage. I will close this issue, let me know if I was wrong.

Mar 27 '25 00:03 nedbat
