typeshed icon indicating copy to clipboard operation
typeshed copied to clipboard

Performance regression in `regr_test.py`

Open AlexWaygood opened this issue 1 year ago • 4 comments

I looks like running regr_test.py on Python 3.12 in CI, rather than Python 3.11, makes the job take 20-30s longer. It used to take between 3m and 3m10s; now it takes between 3m30s and 3m40s. That's actually pretty noticeable on PRs like #11585, #11587 or https://github.com/python/typeshed/pull/11589, where we don't run mypy_primer, meaning that regr_test.py is the slowest job in CI for those PRs.

It seems good generally for us to run this job on the latest version of Python, so I'm not necessarily suggesting that we should go back to running it on Python 3.11. But I'm curious if anybody has any ideas as to why bumping the Python version would make the job slower in CI. (I've experimented on my local fork with changing it back to using 3.11, and it fixes the performance regression -- it was definitely caused by #11544, unfortunately.)

AlexWaygood avatar Mar 13 '24 10:03 AlexWaygood

Here's the first run that used Python 3.12: https://github.com/python/typeshed/actions/runs/8206228752/job/22444848542

And here's the one immediately before that, which used Python 3.11: https://github.com/python/typeshed/actions/runs/8206219736/job/22444819614?pr=11524

AlexWaygood avatar Mar 13 '24 11:03 AlexWaygood

I see something similar if I try this locally on linux:

$ time python3.11 ./tests/regr_test.py --all --verbosity QUIET

Test completed successfully!

real	1m0,876s
user	5m49,798s
sys	0m24,210s
$ time python3.12 ./tests/regr_test.py --all --verbosity QUIET

Test completed successfully!

real	1m9,412s
user	6m56,667s
sys	0m25,051s

I emptied the mypy cache beforehand each time. Is this some issue with mypy and Python 3.12?

I'm fine with using 3.11 in CI for now, although we should add a comment explaining the situation.

srittau avatar Mar 13 '24 11:03 srittau

I haven't investigated really yet (and probably won't have time until this evening at the earliest), but here's a few hypotheses I can think of:

  1. Some stdlib function we're using a lot in the test runs much slower on py312
  2. Mypy runs much slower on py312
  3. uv runs much slower on py312
  4. Say we have regression tests for our package types-foo, and types-foo has a non-types dependency on bar. We dynamically install those non-types dependencies into a venv before running the regression tests for types-foo. Maybe bar (or a dependency of bar) doesn't provide py312 wheels yet, meaning we're doing something silly like building a C extension from source?

It would be good to get to the bottom of things anyway -- if it's mypy, CPython or uv to blame, I'd be interested in letting them know.

AlexWaygood avatar Mar 13 '24 11:03 AlexWaygood

On my machine there's about a 10s difference consistently. I didn't think to clear mypy cache so take this with a grain of salt (and I've got other things to do today). But hopefully this offers some pointers.

Some mypy subprocess runs can differ up to 2s, mostly slower on Python 3.12, but not always, so it could average out over the 255 subprocess calls (so I'll refrain from posting my mypy subprocesses timing results for now)

Note that the install command with uv has a clear performance difference though. At least on my machine. (as Alex mentionned, could be caused by the installed library's Python 3.12 version, not necessarily a problem in vu itself)

Python 3.11

Checking uv subprocess commands with perf_counter

Execution time of ['uv', 'pip', 'install', 'mypy==1.9.0', 'MarkupSafe', '--no-cache-dir']: 11.079815700002655
Execution time of ['uv', 'pip', 'install', 'mypy==1.9.0', 'cryptography>=35.0.0', '--no-cache-dir']: 12.484998500000074
Execution time of ['uv', 'pip', 'install', 'mypy==1.9.0', 'numpy', '--no-cache-dir']: 13.562204699999711
Execution time of ['uv', 'pip', 'install', 'mypy==1.9.0', 'urllib3>=2', '--no-cache-dir']: 10.540249300000141
Execution time of uv venv \.venv: 0.04944420000174432
Execution time of uv venv \.venv: 0.05479460000060499
Execution time of uv venv \.venv: 0.09991710000031162
Execution time of uv venv \.venv: 0.15117509999981849

Some preliminary cprofile results

regr_test_py311.prof% sort tottime
regr_test_py311.prof% stats 20     
Wed Mar 13 16:28:21 2024    regr_test_py311.prof

         243748 function calls (238315 primitive calls) in 41.590 seconds

   Ordered by: internal time
   List reduced from 1107 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      146   33.103    0.227   33.103    0.227 {method 'acquire' of '_thread.lock' objects}
    23494    7.498    0.000    7.498    0.000 {built-in method nt.unlink}
     2155    0.417    0.000    0.417    0.000 {built-in method nt.rmdir}
  2155/17    0.171    0.000    8.302    0.488 shutil.py:598(_rmtree_unsafe)
     2155    0.098    0.000    0.098    0.000 {built-in method nt.scandir}
    25631    0.062    0.000    0.062    0.000 {method 'stat' of 'nt.DirEntry' objects}
    25631    0.049    0.000    0.118    0.000 shutil.py:570(_rmtree_isdir)
      479    0.017    0.000    0.018    0.000 {built-in method nt.stat}
       56    0.012    0.000    0.012    0.000 {built-in method io.open_code}
        3    0.012    0.004    0.012    0.004 {built-in method _imp.create_dynamic}
   407/55    0.011    0.000    0.033    0.001 _parser.py:509(_parse)
    25647    0.007    0.000    0.007    0.000 {built-in method _stat.S_ISDIR}
    11583    0.006    0.000    0.006    0.000 _parser.py:233(__next)
       56    0.006    0.000    0.006    0.000 {built-in method marshal.loads}
      810    0.005    0.000    0.008    0.000 <frozen importlib._bootstrap_external>:96(_path_join)
   678/51    0.005    0.000    0.014    0.000 _compiler.py:37(_compile)
    10667    0.005    0.000    0.011    0.000 _parser.py:254(get)
       17    0.004    0.000    0.004    0.000 {built-in method nt.mkdir}
      259    0.004    0.000    0.005    0.000 _compiler.py:241(_optimize_charset)
     87/1    0.004    0.000   41.590   41.590 {built-in method builtins.exec}

Python 3.12

Checking uv subprocess ommands with perf_counter

Execution time of ['uv', 'pip', 'install', 'mypy==1.9.0', 'MarkupSafe', '--no-cache-dir']: 16.475093000000925
Execution time of ['uv', 'pip', 'install', 'mypy==1.9.0', 'cryptography>=35.0.0', '--no-cache-dir']: 16.486071899998933
Execution time of ['uv', 'pip', 'install', 'mypy==1.9.0', 'numpy', '--no-cache-dir']: 15.042250600003172
Execution time of ['uv', 'pip', 'install', 'mypy==1.9.0', 'urllib3>=2', '--no-cache-dir']: 9.922830200001044
Execution time of uv venv \.venv: 0.13397619999886956
Execution time of uv venv \.venv: 0.1747261000018625
Execution time of uv venv \.venv: 0.30994009999994887
Execution time of uv venv \.venv: 0.3153578999990714

Some preliminary cprofile results (look at those primitive calls!!!)

regr_test_py312.prof% strip
regr_test_py312.prof% sort tottime
regr_test_py312.prof% stats 20     
Wed Mar 13 16:27:19 2024    regr_test_py312.prof

         1377313 function calls (1316539 primitive calls) in 51.433 seconds

   Ordered by: internal time
   List reduced from 1284 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   120/44   17.097    0.142    3.751    0.085 {method 'get' of '_queue.SimpleQueue' objects}
17229/109   15.126    0.001    0.478    0.004 {built-in method _winapi.CopyFile2}
    23494    8.901    0.000    8.901    0.000 {built-in method nt.unlink}
     42/2    3.862    0.092    0.000    0.000 {method 'read' of '_io.TextIOWrapper' objects}
17229/14114    1.100    0.000    1.033    0.000 {built-in method nt._path_isdir}
1624/1388    0.605    0.000    0.530    0.000 {built-in method nt.utime}
     2155    0.523    0.000    0.523    0.000 {built-in method nt.rmdir}
1673/1265    0.400    0.000    0.355    0.000 {built-in method nt.mkdir}
    39702    0.369    0.000    0.866    0.000 <frozen ntpath>:107(join)
    88133    0.264    0.000    0.390    0.000 <frozen ntpath>:179(splitroot)
1624/1357    0.251    0.000    0.243    0.000 {built-in method nt.chmod}
   1624/2    0.235    0.000    0.334    0.167 shutil.py:479(_copytree)
     63/1    0.196    0.003   41.568   41.568 threading.py:995(_bootstrap)
3859/3688    0.187    0.000    0.210    0.000 {built-in method nt.scandir}
17229/101    0.182    0.000    1.182    0.012 shutil.py:427(copy2)
  2155/17    0.153    0.000    9.699    0.571 shutil.py:607(_rmtree_unsafe)
       25    0.142    0.006    0.142    0.006 {built-in method _winapi.WaitForSingleObject}
171649/171303    0.116    0.000    0.150    0.000 {built-in method nt.fspath}
1652/1497    0.113    0.000    0.110    0.000 {built-in method nt._path_exists}
   225912    0.100    0.000    0.101    0.000 {built-in method builtins.isinstance}

Avasam avatar Mar 13 '24 21:03 Avasam