
Total memory limitations for PCA

Open · b-grimaud opened this issue 9 months ago · 1 comment

As requested, moving this over from discussions.

This was discussed a bit more in #3722: computing PCA metrics on analyzers with a lot of units and a lot of channels, with a high n_jobs, can sometimes pre-allocate a very large array that does not fit in memory.

The current job_kwargs that control memory usage are, as far as I understand, only relevant when dealing with recordings.

The current fix in #3721 is to automatically throttle n_jobs to stay within memory constraints. Something else that was mentioned would be to allocate a single array shared by multiple workers, but I believe there were concerns about concurrent access.
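The throttling idea could be sketched roughly like this. This is a hypothetical helper, not SpikeInterface's actual API; the assumed per-worker buffer shape (spikes x channels x samples, float32) is only illustrative:

```python
def throttle_n_jobs(n_jobs, n_spikes_per_unit, n_channels, n_samples,
                    memory_budget_bytes, dtype_size=4):
    """Cap n_jobs so that per-worker buffers fit in a memory budget.

    Hypothetical sketch: assumes each worker pre-allocates one array of
    shape (n_spikes_per_unit, n_channels, n_samples) with `dtype_size`
    bytes per element (4 for float32).
    """
    per_worker_bytes = n_spikes_per_unit * n_channels * n_samples * dtype_size
    if per_worker_bytes == 0:
        return n_jobs
    # Integer division: how many such buffers fit in the budget.
    max_workers = max(1, memory_budget_bytes // per_worker_bytes)
    return min(n_jobs, max_workers)


# Example: 500 spikes/unit, 384 channels, 90 samples, float32
# -> ~66 MB per worker; a 256 MiB budget only fits 3 workers.
n_jobs = throttle_n_jobs(8, 500, 384, 90, 256 * 1024**2)
```

In this example the requested 8 workers would be throttled down to 3; with a larger budget the requested n_jobs is returned unchanged.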

b-grimaud · Mar 25 '25

Yeah there have been discussions to more fully incorporate job_kwargs into PCA as well, but I don't think anyone has had the time to do that. I don't typically work that low level in SI, so I wouldn't be able to provide pointers on this. I have noticed with our current setup for my own experiments (our group uses Windows machines) that setting n_jobs to 1 for PCA actually provides a boost. So I think we would need a deeper discussion about this overall! Just still have a lot of projects being worked on across the group.

zm711 · Mar 26 '25