
parallel processing with silence_periods

Open JeffreyBoucher opened this issue 5 months ago • 31 comments

Hello,

In order to guard against a few periods of time in which some recordings of mine were saturated, I've:

  • written and applied a function to detect these periods
  • applied silence_periods with the "noise" mode to them and the surrounding milliseconds
  • noted these periods for later exclusion from analysis

This process works quite well. However, silence_periods is quite slow when I don't run it in parallel. The outer layer of this function does not appear to take an n_jobs kwarg; when I go into silence_periods.py myself and add the kwarg to get_noise_levels, it works, but the parallel pool doesn't close properly afterward, which leads to bugs later when I try to parallel-process something else...
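For reference, the relevant part of my pipeline looks roughly like this (windowsToSilenceArray comes out of my own saturation-detection code; the names are just illustrative):

import spikeinterface.full as si

# windowsToSilenceArray holds the saturated windows (plus some padding) found by my own detection step
recording = si.silence_periods(recording, windowsToSilenceArray, mode="noise")
# later: exclude the same windows from analysis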

So, this post is either a feature request or a question about how to do this properly if I'm going about it the wrong way. Is there a way to insist on parallel processing for silence_periods? If not, could you please implement this in a future release? Additionally: is there a spikeinterface-internal way I should be closing the parallel pool after I start one like this?

Thanks!

Jeff Boucher

JeffreyBoucher avatar Jul 07 '25 12:07 JeffreyBoucher

@samuelgarcia I think you're the best one to explain this!

zm711 avatar Jul 07 '25 12:07 zm711

Hi Jeffrey. I am not sure I understand the question. silence_periods returns a lazy recording that computes the noise masking on the fly around the given list of periods. In short, get_traces() will trigger the "computation" like any other recording.

The parallelism will happen if you do recording.save(..., n_jobs=10), or detect_peaks on it, or any other operation that scans and reads the entire recording by chunk.
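For example (a rough sketch):

rec_silenced = silence_periods(recording, list_periods, mode="zeros")
# the chunked, parallel scan of the whole recording happens here
rec_silenced.save(folder="preprocessed", n_jobs=10)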

samuelgarcia avatar Jul 09 '25 10:07 samuelgarcia

Hello,

My understanding of SI is that things should work the way you describe. However, that's not how it works in my code. When I run the line recording = si.silence_periods(recording, windowsToSilenceArray, mode="noise"), I get the following output from my terminal:

noise_level (no parallelization): 100%|██████████| 20/20 [00:13<00:00, 1.50it/s]

I can parallelize this in the way I described above.

In addition to this, and perhaps more importantly (though I can't confirm because of a sudden IT problem on the relevant computer), I believe this process also occurs when I run si.concatenate_recordings, another function which does not take an n_jobs argument. I could also speed this up the way I described, but then I had the problem with the pool not closing properly...

Anyway... If you can reproduce this, maybe the issue is that the silence_periods function is not properly lazy?

Also, for the record, I am using SI version 0.102.3

Thanks for your help!

Jeff Boucher

JeffreyBoucher avatar Jul 09 '25 11:07 JeffreyBoucher

It looks to me like we are running into this here: https://github.com/SpikeInterface/spikeinterface/blob/68c8d00336af2625cccc13c1a28f9b47d4c7c932/src/spikeinterface/preprocessing/silence_periods.py#L72-L78 and this function takes in job_kwargs

https://github.com/SpikeInterface/spikeinterface/blob/68c8d00336af2625cccc13c1a28f9b47d4c7c932/src/spikeinterface/core/recording_tools.py#L697-L704

One way we could test this, @JeffreyBoucher: could you set global job_kwargs and see if silence_periods speeds up and the pool doesn't get broken? If that is the case, then we should probably expose these kwargs properly for the case where someone gets noise levels. To set global job kwargs, do:


si.set_global_job_kwargs(n_jobs=xx)
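For example, set it once near the top of your script, before the preprocessing calls (a sketch with arbitrary values):

import spikeinterface.full as si

si.set_global_job_kwargs(n_jobs=8, progress_bar=True)

# subsequent calls that need noise levels should pick up the global n_jobs
recording = si.silence_periods(recording, windowsToSilenceArray, mode="noise")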

zm711 avatar Jul 09 '25 12:07 zm711

Okay, I will give this a try and let you know (as soon as the IT problem is solved)

Thanks!

JeffreyBoucher avatar Jul 09 '25 12:07 JeffreyBoucher

Hello!

The IT problem was solved so I was able to try setting the global. This seems to be working quite well, thanks! This solves my short-term problem, though I still suspect that this is non-lazy behavior you'll want to fix...

Also, I was able to rule out si.concatenate_recordings as my other problem. It's not the culprit: it was the function next to it, set_probe, run on the concatenated recording object.

Thanks for your help!

Jeff Boucher

JeffreyBoucher avatar Jul 09 '25 15:07 JeffreyBoucher

Spoke a bit too soon, I guess. Oddly enough, when I run detect_peaks just after all this, I can neither set n_jobs manually nor use the global argument... The problem is specifically with noise_level again, internal to detect_peaks. I don't really understand this, but this used to be the stage where I had problems with my parallel pool not closing... I'll let you know if I can finagle something.

JeffreyBoucher avatar Jul 09 '25 15:07 JeffreyBoucher

I will say that getting noise levels is not a lazy process, as it requires reading the actual data. So this does break the idea of pure laziness. I don't think this is being applied eagerly to the recording itself at this stage (i.e. the recording object is still lazy), but getting the noise chunks is not.
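To illustrate the distinction (just a sketch):

# the noise levels are estimated eagerly, right here, by reading chunks of real data
rec_silenced = si.silence_periods(recording, periods, mode="noise")

# whereas the masking itself stays lazy and only runs when traces are read
traces = rec_silenced.get_traces(start_frame=0, end_frame=30_000)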

> Spoke a bit too soon, I guess. Oddly enough, when I run detect_peaks just after all this, I can neither set n_jobs manually nor use the global argument... The problem is specifically with noise_level again, internal to detect_peaks. I don't really understand this, but this used to be the stage where I had problems with my parallel pool not closing... I'll let you know if I can finagle something.

Yeah, definitely let us know. Also, could you tell us your OS and whether you're working on a cluster (files + CPUs remote) vs. accessing data over a network (i.e. just the files are remote and the analysis is local)?

zm711 avatar Jul 09 '25 15:07 zm711

Getting things like this sometimes too:

[screenshot]

JeffreyBoucher avatar Jul 09 '25 15:07 JeffreyBoucher

The operating system I am using is Windows 11, with the PyCharm IDE.

I'm using a local computer, not a cluster. The files are on a NAS.

Thanks!

Jeff Boucher

JeffreyBoucher avatar Jul 09 '25 15:07 JeffreyBoucher

Alright, getting back into the SI code internals a bit...

I don't know the circumstances in which si.set_global_job_kwargs does or does not apply, but in the peak_detection code (within check_params), this line is eventually reached, involving get_noise_levels again:

[screenshot]

The random_chunk_kwargs are empty for me. Earlier in the code, job_kwargs and method_kwargs are split:

[screenshot]

The method_kwargs do transfer down to check_params, but only the detect_threshold argument (for me) is kept.

I think I might be able to hack a solution here in a few different ways, each of which might break the pool for me again... It seems odd to me that si.set_global_job_kwargs wouldn't work here; do you know why that might be?

Thanks for your help

JeffreyBoucher avatar Jul 09 '25 16:07 JeffreyBoucher

Dug a bit deeper... The first time fix_job_kwargs is called within this function, it properly extracts the value I set (15). Every other time, it extracts different globals, which include n_jobs of 1. I'll try to figure out when exactly this shift occurs...

JeffreyBoucher avatar Jul 09 '25 16:07 JeffreyBoucher

Final set of info before I call it for the night:

I got this warning, which may or may not be relevant, just before it occurred:

"RuntimeWarning: 'spikeinterface.core.job_tools' found in sys.modules after import of package 'spikeinterface.core', but prior to execution of 'spikeinterface.core.job_tools'; this may result in unpredictable behaviour"

The occurrence also happened in this section of the process.py file of the concurrent.futures package:

[screenshot]

I do think I can solve my short-term problems by forcing n_jobs = 15 somewhere after it changes, but let me know if this information helps develop a less hacky solution.

Thanks!

Jeff Boucher

JeffreyBoucher avatar Jul 09 '25 17:07 JeffreyBoucher

Thanks, I think something could be getting messed up with the global variable. I haven't tried silence_periods myself, so I don't know why it would break things. Could you try calling get_global_job_kwargs at the end of a script? That way we could see whether we are incorrectly grabbing the kwarg or actually overwriting it at some point.
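Something like this at the very end of the script:

import spikeinterface.full as si
print(si.get_global_job_kwargs())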

zm711 avatar Jul 09 '25 20:07 zm711

Hello!

At each stage of this I was actually running get_global_job_kwargs, sorry if that wasn't clear; the point where it switches is that results = super() bit (or somewhere internal to it, I guess).

JeffreyBoucher avatar Jul 10 '25 07:07 JeffreyBoucher

I misunderstood; I thought you just added some prints into the code while you were running it. I think we can try to reproduce this with some simulated recordings. I use Mac and Windows, so I can try to reproduce there, and we can get someone else on the team to test on Linux to make sure this isn't OS-specific. I'm a little busy this week so I can't promise it'll be done this week, but it is definitely on the to-do list.

zm711 avatar Jul 10 '25 11:07 zm711

Thanks,

The way I've ended up moving past my bottleneck is just saving a pickle after it finished running (so I can continue figuring out drift-correction parameters). Eventually, when I start sorting all my sessions, I would love for this all to go faster, but I shouldn't need it imminently.

Thanks for your help!

Jeff Boucher

JeffreyBoucher avatar Jul 10 '25 11:07 JeffreyBoucher

Hi, I will try to have a deeper look into this. Indeed, get_noise_levels(), which now uses parallelisation (this was not the case a year ago), is sometimes called in the init of a class, so the local job_kwargs are not propagated; the only way around this is to set global job_kwargs. But in your case you did set global job_kwargs, so this should work...

samuelgarcia avatar Jul 10 '25 16:07 samuelgarcia

@JeffreyBoucher could you submit the script you're testing? If we are going to try to mimic this on our end, I want to make sure we are using the same settings as you.

recording = si.read_xx(path/to/my/data)
rec_f = si.bandpass_filter(...)

etc., with the actual arguments, but without your data (I would like to know what kind of file it is, though).

zm711 avatar Jul 14 '25 12:07 zm711

Hello,

I'll try and reproduce the error with a contained version tomorrow.

Thanks,

Jeff Boucher

JeffreyBoucher avatar Jul 14 '25 18:07 JeffreyBoucher

spikesort_concatenated.txt

Hello,

I was able to contain the problem in a single .py file (changed to .txt so I could place it here). Requirements are pathlib, numpy 1.26.4, and spikeinterface 0.102.3.

Let me know if you are able to replicate!

To clarify: you should see that noise_level is properly parallelized within spikeglx_preprocessing, but not within detect_peaks.

Thanks,

Jeff Boucher

JeffreyBoucher avatar Jul 15 '25 12:07 JeffreyBoucher

Sweet, I'll work on testing that. I'll start by testing the exact same script against main. We will be doing a release soon(ish), so if this is fixed on main, I think the advice would be to either install from main or wait for the next release. I don't think we've played with this too much, though, so I assume the bug will still be there.

zm711 avatar Jul 15 '25 12:07 zm711

I figured out the most immediate issue (on main). For @samuelgarcia: we are splitting the method_kwargs and job_kwargs at the beginning of the node pipeline, but only submitting the method_kwargs to the first node of the pipeline. Because we are submitting empty kwargs rather than no kwargs, we fall back to our defaults rather than the global set by the user. Is there a reason why we don't allow all nodes to have access to the job_kwargs?

Visually, https://github.com/SpikeInterface/spikeinterface/blob/6bbe42d3a3d439c92d13781c96bdc71586df43a4/src/spikeinterface/sortingcomponents/peak_detection.py#L109-L116

method_class is used to get the noise levels, but no job_kwargs are passed:

https://github.com/SpikeInterface/spikeinterface/blob/6bbe42d3a3d439c92d13781c96bdc71586df43a4/src/spikeinterface/core/recording_tools.py#L762C8-L776

In the noise-level function we split the kwargs, but because we don't pass job_kwargs in, the split yields an empty dict which is then passed on. This empty dict overrides the global job_kwargs that were set.
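Roughly, the pattern looks like this (a simplified sketch of the failure mode, not our actual code):

# simplified sketch, not the actual spikeinterface code
global_job_kwargs = {"n_jobs": 15}   # what the user set globally
default_job_kwargs = {"n_jobs": 1}   # the hard-coded defaults

def get_noise_levels_sketch(recording, **random_chunk_kwargs):
    # the caller hands us an explicit-but-empty dict, so the split yields {}
    # and we end up with the defaults instead of the user's globals
    job_kwargs = {**default_job_kwargs, **random_chunk_kwargs}
    return job_kwargs   # -> {"n_jobs": 1}, even though the global says 15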

I'm mostly repeating your info (@JeffreyBoucher), but with the added explanation of why the kwargs are not propagated (@samuelgarcia will understand).

zm711 avatar Jul 15 '25 17:07 zm711

@JeffreyBoucher, this is a pretty deep bug so thanks a bunch for finding this. We may need to discuss this at an actual meeting to figure out the best fix (especially since this could appear in other places within sorting components). At the moment there's nothing you can do about it, but we will actively work on fixing it. Sorry!

As for the silence periods I think we will have to discuss exposing job_kwargs there too. But for now that function should be fixed by the global_job_kwargs. So we fixed part of the problem!

zm711 avatar Jul 15 '25 17:07 zm711

Hello!

Thanks for your help in fixing it; please update me if you ever resolve this in a future update, and I'll know to download it!

In the meantime, I guess it's either waiting or seeing whether replacing these periods with silence instead of noise works (I doubt it, but I've never tried it).

Thanks!

Jeff Boucher

JeffreyBoucher avatar Jul 15 '25 19:07 JeffreyBoucher

Actually, I've dug even deeper into this, and it is a problem with propagating the noise_levels property through our silence_periods. The bug happens because our node_pipeline loses access to the noise_levels (just for silence_periods; you could try this with common_reference and it would work fine). The first step of the node pipeline then checks whether the noise levels exist, and because they no longer do, each worker has to recalculate the noise and does not get access to the global_job_kwargs. So I think we actually need to fix this at the silence_periods level.

During most of detect_peaks, my simulated recording (+ filter + CMR) shows up as this:

GroundTruthRecording (CommonReferenceRecording): 64 channels - 30.0kHz - 1 segments
                      18,000,000 samples - 600.00s (10.00 minutes) - float32 dtype - 4.29 GiB

despite the fact that I had run silence_periods on it. This works fine for the multiprocessing. Then, when it goes to actually run_node_pipeline, if I check the recording type I get

GroundTruthRecording (SilencedPeriodsRecording): 64 channels - 30.0kHz - 1 segments
                      18,000,000 samples - 600.00s (10.00 minutes) - float32 dtype - 4.29 GiB

In addition, I lose the noise_levels_mad_raw property at this point, which forces the pipeline to recompute the noise levels without multiprocessing.
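For reference, my check was roughly this (a sketch):

print(recording)                      # which recording class is this inside the pipeline?
print(recording.get_property_keys())  # is the cached noise-level property still attached?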

This is more stream of consciousness so that I have notes for our meeting to fix this.

zm711 avatar Jul 16 '25 17:07 zm711

@yger Sam said you should look at this, since we need to see whether this is occurring at the level of the node_pipeline or at the level of silence_periods.

zm711 avatar Jul 16 '25 18:07 zm711

I think this is because the recording passed to the workers does not contain minimal properties like the noise levels. We should have an option for this. I will check.

samuelgarcia avatar Jul 16 '25 20:07 samuelgarcia

I will check also

yger avatar Jul 17 '25 07:07 yger

I've tried a putative fix in #4071. To me, the problem is simply that we should explicitly propagate job_kwargs in the signature of silence_periods, because otherwise we are always relying on the defaults. Maybe this is the case in other preprocessing functions; I'll double-check. Quite often, with @samuelgarcia, we set the defaults once and for all at the beginning of the script and don't really play with job_kwargs on a per-function basis, so maybe some signatures are missing the arguments.
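Roughly, the direction is something like this (a sketch only, not the actual diff in #4071):

# sketch: expose job_kwargs in the signature and forward them when the
# noise levels have to be estimated
def silence_periods(recording, list_periods, mode="zeros", noise_levels=None, **job_kwargs):
    if mode == "noise" and noise_levels is None:
        noise_levels = get_noise_levels(recording, **job_kwargs)
    ...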

yger avatar Jul 17 '25 07:07 yger