Compute principal components slow on Windows
#3249
Based on @chrishalcrow's testing, computing PCA on Windows is an extremely slow step. I know the current implementation goes straight to ProcessPoolExecutor, so maybe we need to revisit this, and I can test locally on Windows. @alejoe91?
Thanks for writing this up @zm711.
I think that the problem could also be an interaction between processes and threads. Sklearn will by default try to max out the number of threads, but we add our own layer of process parallelization on top. In the ChunkRecordingExecutor we have an additional max_threads_per_process arg, but the machinery is a bit more complicated. I think we should give it a try and see if it fixes the issue.
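To make the suspected oversubscription concrete, here is a minimal sketch of the arithmetic, using only the standard library. The helper names threads_per_process and total_threads are hypothetical and are not part of SpikeInterface or sklearn; the 8-worker, 16-core numbers are made up for illustration.

```python
import os


def total_threads(n_jobs, threads_each):
    """Worst-case number of threads running at the same time."""
    return n_jobs * threads_each


def threads_per_process(n_jobs, n_cpus=None):
    """Largest per-process thread count that keeps n_jobs * threads <= n_cpus.

    Hypothetical helper for illustration only.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be >= 1")
    if n_cpus is None:
        n_cpus = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, n_cpus // n_jobs)


# Illustrative case: 8 worker processes on a 16-core machine.
# If the numerical library in each worker also starts 16 threads:
print(total_threads(8, 16))        # 128 threads competing for 16 cores
# Capping each worker keeps the total at the core count:
print(threads_per_process(8, 16))  # 2
```

If this is what is happening, either lowering the number of processes or capping the threads inside each process should bring the total back to roughly the number of cores.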
Let me link the issue where Chris saw this happening on his Windows machine too, with newer versions of sklearn but not older ones: https://github.com/SpikeInterface/spikeinterface/issues/2817
But it would be cool if that speeds things up on Windows, since that is a big workflow and testing bottleneck. I haven't dug deeply into the PCA code to see how complicated it would be :)
Please do try to reduce the execution time of the PCA computation in the export to Phy process. That will be a significant help to the compute pipeline on Windows machines. Thank you.
We have found some improvements on Windows by running with n_jobs=1, i.e. shutting off multiprocessing for PCA only.
I was having a problem with all analyzer extensions being really slow except the random spikes. Setting n_jobs=1 and max_threads_per_worker to the maximum solved it for me.
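For anyone who wants to try the workaround from the last two comments, here is a rough sketch of building the job kwargs only on Windows. The helper pca_job_kwargs is hypothetical, and the key names are copied from the comments above; depending on the installed version the thread argument may be called max_threads_per_process instead, so treat the names as assumptions and check your version.

```python
import os
import platform


def pca_job_kwargs(system=None):
    """Return the workaround job kwargs on Windows, nothing elsewhere.

    Hypothetical helper; key names are taken from the thread and may
    differ between versions.
    """
    system = system or platform.system()
    if system == "Windows":
        return {
            "n_jobs": 1,  # shut off multiprocessing
            "max_threads_per_worker": os.cpu_count() or 1,  # allow all threads
        }
    return {}  # leave the library defaults untouched on other platforms


job_kwargs = pca_job_kwargs()
print(job_kwargs)

# Assumed usage, where `analyzer` is an existing SortingAnalyzer:
# analyzer.compute("principal_components", **job_kwargs)
```

This keeps a single process and lets the numerical library use its own threads, which is the combination reported to help above.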