
WIP: Enable line profiling during pytest

Open larsoner opened this issue 3 years ago • 7 comments

Todo:

  • [ ] Make tests pass when using the script (init_kwargs has problems currently)

Should help with #10704. Use cProfile -- adapting kernprof/line_profiler code -- to:

  1. Run pytest
  2. Print -- ordered from slowest to fastest -- any MNE functions that are 1) decorated with @verbose (which is a lot of them) and 2) take at least 1 sec during the tests
  3. For each printed function, print any lines that take >= 1% of the execution time
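The cProfile half of this could be sketched roughly as follows (hypothetical function names -- the real script also hooks into the @verbose decorator and adds the line-level reporting):

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    """Stand-in for an MNE function decorated with @verbose."""
    return sum(i * i for i in range(n))


def profile_and_report(func, *args):
    """Run func under cProfile and return its result plus a per-function
    report sorted slowest-first by cumulative time."""
    prof = cProfile.Profile()
    prof.enable()
    result = func(*args)
    prof.disable()
    stream = io.StringIO()
    stats = pstats.Stats(prof, stream=stream)
    stats.sort_stats("cumulative").print_stats()
    return result, stream.getvalue()


result, report = profile_and_report(slow_square_sum, 100_000)
```

The actual script would run pytest in place of the stand-in function and filter the stats down to functions matching the criteria above.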

Locally I get for example this output, first what we have in main:

$ python tools/pytest_profile.py -m "not ultraslowtest" mne/tests/
Output in main (unchanged)
...
=================================================================== slowest 20 test modules ====================================================================
70.83s total   mne/tests/test_rank.py
63.40s total   mne/tests/test_epochs.py
44.12s total   mne/tests/test_chpi.py
36.02s total   mne/tests/test_cov.py
34.45s total   mne/tests/test_coreg.py

...

===================================================================== slowest 20 durations =====================================================================
12.34s call     mne/tests/test_chpi.py::test_calculate_chpi_coil_locs_artemis
9.89s call     mne/tests/test_dipole.py::test_min_distance_fit_dipole
8.99s call     mne/tests/test_chpi.py::test_calculate_chpi_positions_vv
6.70s call     mne/tests/test_chpi.py::test_calculate_chpi_snr
6.55s call     mne/tests/test_morph_map.py::test_make_morph_maps

...

==================================================== 3 failed, 1316 passed, 8 skipped in 425.58s (0:07:05) =====================================================

Followed by the new stuff:

Determining timing results...
************************************************************************************************************************
#1. estimate_rank: 51.49 s (8.13%)
In /Users/larsoner/python/mne-python/mne/rank.py @ 20
Line #      Hits         Time  Per Hit   % Time  Line Contents
    55       428   51460041.0 120233.7     99.9      s = linalg.svdvals(data)

************************************************************************************************************************
#2. compute_rank: 27.1928 s (4.29%)
In /Users/larsoner/python/mne-python/mne/rank.py @ 274
Line #      Hits         Time  Per Hit   % Time  Line Contents
   331        36     301137.0   8364.9      1.1              info = pick_info(info, [info['ch_names'].index(name)
   349       640    1813007.0   2832.8      6.7      picks_list = _picks_by_type(info, meg_combined=True, ref_meg=False,
   371       387    1062395.0   2745.2      3.9              proj_op, n_proj, _ = make_projector(info['projs'], ch_names)
   396       100     517988.0   5179.9      1.9                      data = np.dot(proj_op, data)
   397       200   14004223.0  70021.1     51.5                  this_rank = _estimate_rank_meeg_signals(
   398       100    1036220.0  10362.2      3.8                      data, pick_info(simple_info, picks), scalings, tol, False,
   407       181     299240.0   1653.3      1.1                          data = np.dot(np.dot(proj_op, data), proj_op.T)
   409       406    5392994.0  13283.2     19.8                      this_rank, sing = _estimate_rank_meeg_cov(
   410       203    1994200.0   9823.6      7.3                          data, pick_info(simple_info, picks), scalings, tol,

************************************************************************************************************************
#3. __init__: 21.3033 s (3.36%)
In /Users/larsoner/python/mne-python/mne/epochs.py @ 396
Line #      Hits         Time  Per Hit   % Time  Line Contents
   444      1108     234287.0    211.5      1.1                  self.drop_log = tuple(
   493       836    3273015.0   3915.1     15.4          info._check_consistency()
   496       835    7461610.0   8936.1     35.0          self.info = pick_info(info, self.picks)
   568       822    1945175.0   2366.4      9.1          self._reject_setup(reject, flat)
   581      1628    1713347.0   1052.4      8.0          self._projector, self.info = setup_proj(self.info, False,
   586       150    5240456.0  34936.4     24.6              self.load_data()  # this will do the projection
   593       106     363208.0   3426.5      1.7                  self._data[ii] = np.dot(self._projector, epoch)
   598       814     515145.0    632.9      2.4          self._check_consistency()

...

************************************************************************************************************************
#92. compute_vol_morph_mat: 1.03483 s (0.16%)
In /Users/larsoner/python/mne-python/mne/morph.py @ 471
Line #      Hits         Time  Per Hit   % Time  Line Contents
   505         4    1034560.0 258640.0    100.0          self.vol_morph_mat = self._morph_vols(None, 'Vertex')

************************************************************************************************************************
#93. filter: 1.02517 s (0.16%)
In /Users/larsoner/python/mne-python/mne/filter.py @ 1922
Line #      Hits         Time  Per Hit   % Time  Line Contents
  2008       146      88870.0    608.7      8.7          update_info, picks = _filt_check_picks(self.info, picks,
  2023       174     928367.0   5335.4     90.6              filter_data(

I think that with these three kinds of profiling output combined, it should be easier to find slow things to optimize.

I think this could probably be merged as-is and iterated on. The only change to core code is a try/except that should add negligible time (I hope).
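For context, the usual way to keep such a hook cheap -- and the pattern kernprof itself relies on -- is a no-op fallback for when no profiler has injected `profile` into builtins. A sketch of the pattern, not necessarily the exact change in this PR:

```python
# When kernprof (or a similar tool) drives the process, it injects a
# `profile` decorator into builtins; otherwise fall back to a no-op, so
# decorated code runs at full speed with negligible overhead.
try:
    profile  # noqa: B018 -- just checking whether it exists
except NameError:
    def profile(func):
        return func


@profile
def heavy_computation(n):
    """Runs unmodified (and unprofiled) when no profiler is active."""
    return sum(range(n))
```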

larsoner avatar Jun 13 '22 22:06 larsoner

This is great @larsoner!

cbrnr avatar Jun 14 '22 04:06 cbrnr

Amazing!

hoechenberger avatar Jun 14 '22 06:06 hoechenberger

How long does the line profiling take? Just to make sure that we don't add much overhead...

cbrnr avatar Jun 14 '22 07:06 cbrnr

Neat! @larsoner did you take this from somewhere? I think this would be very useful in the wider ecosystem.

agramfort avatar Jun 14 '22 08:06 agramfort

How long does the line profiling take? Just to make sure that we don't add much overhead...

The overhead seems to be negligible in testing so far.

@larsoner did you take this from somewhere? I think this would be very useful in the wider ecosystem.

Not really, I took the printing bits from line_profiler/kernprof but under the hood it's really just using cProfile to do the magic. I agree something like this could be more broadly useful, but this PR takes advantage of something that is unique to us, namely that we already decorate a lot of our slow stuff with the @verbose decorator, so I can just inject what is essentially @profile wrapping that way. To make it really broadly useful elsewhere, you'd have to do some namespace inspection + rewriting magic that would almost certainly be brittle and cause problems in tests.

This PR already breaks some stuff like our _init_kwargs business for BaseRaw (at least it does in my local testing), and I can fix that. But any additional/fancier magic I'd expect to create additional problems for other repos that would be tougher to work around.

TL;DR: I agree it would be nice but it would be difficult to make it more broadly applicable I think

larsoner avatar Jun 14 '22 15:06 larsoner

@larsoner this has all green CIs but you've recently marked as WIP. Is there more to do here?

drammock avatar Jun 28 '22 16:06 drammock

Yeah, not all tests pass when running in profile mode, which I'd like to fix.

larsoner avatar Jun 28 '22 17:06 larsoner

Thanks to @cbrnr in https://github.com/mne-tools/mne-qt-browser/issues/161#issuecomment-1489779902 I think we don't need this. Profiling on OSX is kind of a pain because you need sudo, but after doing sudo -i I can do:

$ root# HOME=/Users/larsoner MNE_DATASETS_TESTING_PATH="/Users/larsoner/python" py-spy top -s -r 1000 -- pytest mne/viz/
2.04s behind in sampling, results may be inaccurate. Try reducing the sampling rate.
Collecting samples from '/Users/larsoner/opt/miniconda3/envs/mne/bin/python3.10 /Users/larsoner/opt/miniconda3/envs/mne/bin/pytest mne/viz/' and subprocesses
Total Samples 444350
GIL: 76.80%, Active: 90.30%, Threads: 7, Processes 2

  %Own   %Total  OwnTime  TotalTime  Function (filename:line)                                                                                                   
  0.00%   0.00%   39.62s    141.2s   _call_with_frames_removed (<frozen importlib._bootstrap>:241)
  0.00%   0.00%   13.00s    13.00s   _compile_bytecode (<frozen importlib._bootstrap_external>:672)
  0.00%   0.00%    9.56s     9.56s   _compute_over (mne/viz/_3d_overlay.py:99)
  1.90%   2.00%    8.62s    25.53s   _process_events (mne/viz/backends/_pyvista.py:1105)
  0.00%   0.00%    8.27s     8.27s   svd (scipy/linalg/_decomp_svd.py:127)
  0.60%   0.60%    8.13s     8.13s   _path_stat (<frozen importlib._bootstrap_external>:147)
  0.10%   0.10%    7.64s     7.64s   _update_alg (pyvista/core/filters/__init__.py:36)
 34.30%  34.30%    7.44s     7.45s   _init_mne_qtapp (mne/viz/backends/_utils.py:165)
  0.00%   0.00%    6.71s     6.71s   _compute_over (mne/viz/_3d_overlay.py:98)
  0.00%   0.00%    6.58s     6.58s   _compute_over (mne/viz/_3d_overlay.py:101)
 ...

With nice config options, line-by-line vs function-level, total-including-children versus owntime, etc. This should be more easily doable on Linux and Windows, too. So I'll close this!

larsoner avatar Mar 30 '23 16:03 larsoner