slow CI docs build

Open drammock opened this issue 2 years ago • 7 comments

Yesterday's overnight doc build timed out. A quick scan of the log output shows this:

generating gallery for auto_tutorials/clinical... [ 25%] 10_ieeg_localize.py
Clean 2539.8 s : 2.05 GB
Clean 2712.5 s : 2.04 GB

https://app.circleci.com/pipelines/github/mne-tools/mne-python/14833/workflows/ee1d9eb6-039a-4178-8395-88441df5bafb/jobs/45868?invite=true#step-139-400

IDK when the time/memory tracking was added, but if I'm interpreting it correctly it seems to mean that at least 172 seconds were spent on rendering that tutorial. @alexrockhill do you have time to look into this? I'd start with a local doc build with PATTERN to see how long that tutorial takes locally, then examine recent changes to it to see what might be making it so slow.
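For reference, the 172-second figure is just the difference between the two cumulative "Clean" timestamps in the log excerpt. A minimal sketch of that arithmetic follows. It assumes each "Clean" line reports cumulative elapsed seconds followed by memory, which is an interpretation of the log and not something confirmed here, and the helper name is made up for illustration:

```python
import re

# Matches lines like "Clean 2539.8 s : 2.05 GB" from the log excerpt.
# Assumption: the first number is cumulative build time in seconds.
CLEAN_LINE = re.compile(r"Clean\s+([\d.]+)\s+s\s*:\s*([\d.]+)\s+GB")


def elapsed_between_checkpoints(log_text):
    """Return the gaps (in seconds) between consecutive 'Clean' timestamps."""
    times = [float(m.group(1)) for m in CLEAN_LINE.finditer(log_text)]
    return [round(later - earlier, 1) for earlier, later in zip(times, times[1:])]


log_excerpt = """
generating gallery for auto_tutorials/clinical... [ 25%] 10_ieeg_localize.py
Clean 2539.8 s : 2.05 GB
Clean 2712.5 s : 2.04 GB
"""

print(elapsed_between_checkpoints(log_excerpt))  # [172.7]
```

Running the same helper over a full log would show every gap between checkpoints, which makes the biggest jumps easy to spot.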

drammock avatar Jun 23 '22 17:06 drammock

Huh, the only thing that has been changed lately is that the video was added. The dev version that's up right now says it takes 2 minutes and 47 seconds to execute. There are a lot of computations and components, so I think it's going to take close to two minutes. I know @larsoner optimized it most recently, and I thought he got it down below 90 seconds when he did that, if I'm not mistaken. Is 172 seconds causing the timeout or it's just the longest-executing tutorial?

alexrockhill avatar Jun 23 '22 17:06 alexrockhill

I remember discussions about a paid Circle plan: whether we could keep our Circle run under 10 minutes, deciding that we couldn't, and that it would need to be paid for. I'm happy to look into optimizing the iEEG tutorial, but it's not clear to me why that's essential for the docs building in time. Aren't we supposed to have more than ten minutes?

alexrockhill avatar Jun 23 '22 18:06 alexrockhill

Is 172 seconds causing the timeout or it's just the longest-executing tutorial?

I just scanned through the timings and found what looked like the biggest jump (the "longest files" summary didn't print, because the build job didn't finish). It's likely that this tutorial is not the sole cause of the timeout, but if it's taking nearly double what it was recently then that's worth looking into IMO, even if it doesn't "solve" the larger issue.

drammock avatar Jun 23 '22 19:06 drammock

I'm happy to look into it, but just to check: I think the larger issue is why we have a ten-minute timeout when I thought we had fixed that.

alexrockhill avatar Jun 23 '22 19:06 alexrockhill

So on 0.24.1 (I think pre-optimization by @larsoner) it was just under two minutes https://mne.tools/0.24/auto_tutorials/clinical/10_ieeg_localize.html and on stable it's also just under two minutes https://mne.tools/stable/auto_tutorials/clinical/10_ieeg_localize.html#sphx-glr-auto-tutorials-clinical-10-ieeg-localize-py. The only thing I've done recently is add the YouTube video, but that seems like it would render almost instantly because it's just some JavaScript. I will profile it now.

alexrockhill avatar Jun 24 '22 14:06 alexrockhill

Hmmm, it took 104 seconds for me locally and nothing seemed out of place. It might be a memory issue on Circle, or maybe I can look into clearing memory better during execution...

alexrockhill avatar Jun 24 '22 14:06 alexrockhill

Yes this tutorial has always been the slowest at 2-3 minutes. I don't think this part is new. Any way to speed it up would be helpful!

Keep in mind that to find slow examples, you can look at the global list in a completed run, like:

https://app.circleci.com/pipelines/github/mne-tools/mne-python/14880/workflows/bf0a0520-2c85-4db7-83d0-e7d7f5d685be/jobs/45967?invite=true#step-139-816

computation time summary:
    - ../tutorials/clinical/10_ieeg_localize.py:                              170.71 sec   1182.3 MB
    - ../tutorials/clinical/20_seeg.py:                                       105.45 sec    590.3 MB
    - ../tutorials/inverse/60_visualize_stc.py:                                74.52 sec    662.3 MB
    - ../tutorials/stats-sensor-space/75_cluster_ftest_spatiotemporal.py:      72.39 sec    128.7 MB
    - ../examples/visualization/brain.py:                                      69.05 sec     27.8 MB
    - ../tutorials/io/60_ctf_bst_auditory.py:                                  65.22 sec    572.1 MB
...
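If it helps, a summary like the one above can be parsed to flag slow files automatically. This is only a sketch: it assumes the lines have exactly the layout shown in this log, and the function name and the 100-second threshold are arbitrary choices for illustration:

```python
import re

# Matches lines like
# "    - ../tutorials/clinical/20_seeg.py:    105.45 sec    590.3 MB"
# Assumption: the layout is exactly as printed in the log excerpt.
SUMMARY_LINE = re.compile(
    r"^\s*-\s+(?P<path>\S+):\s+(?P<sec>[\d.]+)\s+sec\s+(?P<mb>[\d.]+)\s+MB\s*$"
)


def parse_summary(text):
    """Return (path, seconds, megabytes) tuples for each summary line."""
    rows = []
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line)
        if match:
            rows.append(
                (match["path"], float(match["sec"]), float(match["mb"]))
            )
    return rows


summary = """
computation time summary:
    - ../tutorials/clinical/10_ieeg_localize.py:                              170.71 sec   1182.3 MB
    - ../tutorials/clinical/20_seeg.py:                                       105.45 sec    590.3 MB
    - ../tutorials/inverse/60_visualize_stc.py:                                74.52 sec    662.3 MB
"""

rows = parse_summary(summary)
# Hypothetical threshold: flag anything slower than 100 seconds.
slow = [path for path, seconds, _ in rows if seconds > 100]
print(slow)
```

With the three lines above, this flags the two clinical tutorials and leaves out 60_visualize_stc.py.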

And our builds actually produce a "hidden" HTML file like this in each directory:

https://mne.tools/dev/auto_tutorials/clinical/sg_execution_times.html

larsoner avatar Jun 24 '22 17:06 larsoner