spikeinterface
mountainsort runs far slower on spikeinterface
An interesting observation I've made while sorting a relatively small recording is an enormous difference in processing time between running mountainsort installed independently (this version) and running it through spikeinterface with mountainsort4 installed via pip. Some background on the recording:
32-channel laminar array recording. Signals sampled at 30 kHz. The analysis covers 1 hour of recording.
Running mountainsort by itself on the data (bandpass filtering, whitening, and the main sort algorithm) takes 418 seconds. When I use mountainsort through spikeinterface, the same steps take 3078 seconds (~7.4x the time).
Both sorters read from the same .mda file, and the settings are all the same (threshold, detect interval, adjacency radius, number of workers).
@magland do you have any idea why this would be?
Strange. The wrapper in spikeinterface is really a thin layer on top of mountainsort. See https://github.com/SpikeInterface/spikeinterface/blob/master/spikeinterface/sorters/mountainsort4/mountainsort4.py#L90
Is it due to the filtering and whitening done in spikeinterface here: https://github.com/SpikeInterface/spikeinterface/blob/master/spikeinterface/sorters/mountainsort4/mountainsort4.py#L100
@rtraghavan: could you try putting some timing calls like
t0 = time.perf_counter()
...
t1 = time.perf_counter()
print(t1 - t0)
in the _run_from_folder of this wrapper to check where the time is lost?
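A slightly more structured version of that timing idea, as a minimal sketch: wrap each stage in a small context manager so the per-stage times are printed and collected in one place. The `timed` helper, the `timings` dict, and `fake_stage` are hypothetical names used only for illustration; they are not part of spikeinterface or mountainsort, and the stage bodies are placeholders for the real filtering, whitening, and sorting calls.

```python
import time
from contextlib import contextmanager

# Hypothetical store for per-stage wall-clock times (seconds).
timings = {}

@contextmanager
def timed(label, store=timings):
    """Measure the wall-clock time of the enclosed block with perf_counter."""
    t0 = time.perf_counter()
    try:
        yield
    finally:
        store[label] = time.perf_counter() - t0
        print(f"{label}: {store[label]:.3f} s")

def fake_stage(n):
    # Placeholder workload standing in for a real processing step.
    return sum(i * i for i in range(n))

# In the wrapper, each real step would go inside its own `with` block.
with timed("filter"):
    fake_stage(10_000)
with timed("whiten"):
    fake_stage(10_000)
with timed("sort"):
    fake_stage(50_000)
```

Comparing the three numbers against the standalone run should show whether the extra time goes into preprocessing or into the sort itself.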
@samuelgarcia I think @rtraghavan is referring to differences between mountainsort3 and mountainsort4. Maybe @magland can comment on it :)
It's true that these versions have different implementations. I am surprised that mountainsort4.py is significantly slower. During the iterations, does it say how many channels are in each neighborhood? @rtraghavan
Apologies for the delay. On later inspection, it seems neither sorter was using the maximum available number of workers. Moreover, I realize from the earlier comments that the total run time in spikeinterface should be compared to the sum of filtering, whitening, and sorting in mountainsort-js. After setting the number of workers equal for both and rerunning the code, I get the following:
mountainsort-js: 363 sec
spikeinterface mountainsort4: 1051 sec
Still a ~3-fold difference between the two. Is this closer to what you'd expect, @magland?
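For reference, the slowdown factors follow directly from the run times reported in this thread. A quick sketch of the arithmetic (the variable names are only for illustration):

```python
# Run times in seconds, as reported in this thread.
first_run = {"mountainsort-js": 418, "spikeinterface mountainsort4": 3078}
second_run = {"mountainsort-js": 363, "spikeinterface mountainsort4": 1051}

def slowdown(run):
    """Ratio of the spikeinterface run time to the standalone run time."""
    return run["spikeinterface mountainsort4"] / run["mountainsort-js"]

ratio_first = slowdown(first_run)
ratio_second = slowdown(second_run)

print(f"first comparison:  {ratio_first:.2f}x")   # 7.36x
print(f"second comparison: {ratio_second:.2f}x")  # 2.90x
```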
@magland During iterations of mountainsort-js and mountainsort4, they report 3 channels within each neighborhood for both recordings.
I should be clear @alejoe91, this is not mountainsort3 vs mountainsort4. The algorithm for mountainsort-js is ml_ms4alg.
Strange... if they are both using ml_ms4alg (mountainsort4 repo alg is no different from ml_ms4alg alg), then I would not expect a difference in timing.
Maybe this discussion should be moved to the mountainsort4 repo.