Difference in clusters sorted by spykingcircus2 and mountainsort5

Open kshtjkumar opened this issue 1 year ago • 8 comments

Hi, I have a question about the large disparity in the number of clusters identified by MountainSort5 and SpykingCircus2. I have a tetrode recording that lasts 1277 seconds. When I sort it with SpykingCircus2, I get 153 clusters, of which 78 have an ISI violation ratio below 0.5. In contrast, the same recording sorted with MountainSort5 yields only 4 clusters, with just 1 cluster exceeding the ISI threshold.

What could be causing such a large difference in the number of clusters?

Here is what the probe looks like: Figure 63

This is the raster from MountainSort5: Figure 65
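For readers unfamiliar with the ISI violation ratio mentioned in the question, here is a minimal pure-Python sketch. It assumes the Hill et al. (2011) style definition (rate of refractory-period violations divided by the unit's overall firing rate), which SpikeInterface's quality metrics appear to follow. The function name and the 1.5 ms threshold are illustrative assumptions, not values taken from this thread.

```python
def isi_violations_ratio(spike_times_s, total_duration_s,
                         isi_threshold_s=0.0015, min_isi_s=0.0):
    """Rate of refractory-period violations relative to the unit's firing rate.

    Assumed definition (Hill et al. 2011 style); thresholds are illustrative.
    """
    times = sorted(spike_times_s)
    n_spikes = len(times)
    if n_spikes < 2 or total_duration_s <= 0:
        return 0.0
    # Count inter-spike intervals shorter than the refractory threshold.
    n_violations = sum(
        1 for a, b in zip(times, times[1:]) if (b - a) < isi_threshold_s
    )
    # Total time around all spikes in which a violation could occur.
    violation_time = 2 * n_spikes * (isi_threshold_s - min_isi_s)
    total_rate = n_spikes / total_duration_s
    violation_rate = n_violations / violation_time
    return violation_rate / total_rate


# Toy unit: four spikes in 10 s, with one pair only 1 ms apart.
ratio = isi_violations_ratio([0.0, 0.001, 0.5, 1.0], total_duration_s=10.0)
print(f"ISI violation ratio: {ratio:.1f}")
```

Under this definition, a ratio near 0 indicates few refractory violations, while a ratio near or above 1 suggests the cluster mixes spikes from more than one source.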

kshtjkumar avatar Jun 18 '24 21:06 kshtjkumar

I don't know which version of spikeinterface you are using, but lots of improvements have been made to circus2 in the main branch. Since it is unlikely that you can find so many clusters on so few channels, either there is a mismatch in the parameters, or something is weird with the data for sc2. Are they properly filtered/preprocessed? Are you setting apply_preprocessing=False if you have already preprocessed the data with your own filters?
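As an illustration of the apply_preprocessing point, here is a hedged sketch of passing that flag through SpikeInterface's run_sorter. The two helper names are hypothetical, and the output folder is left at its default because the name of that argument differs between SpikeInterface versions.

```python
def sc2_params_for_preprocessed_input():
    """Sorter parameters for a recording that was already filtered upstream."""
    # apply_preprocessing=False asks SpykingCircus2 to skip its own
    # preprocessing, so the data are not filtered a second time.
    return {"apply_preprocessing": False}


def run_sc2_on_preprocessed(recording):
    """Run SpykingCircus2 on an already preprocessed recording (sketch)."""
    # Imported lazily so the parameter helper works without spikeinterface.
    import spikeinterface.sorters as ss

    return ss.run_sorter(
        "spykingcircus2", recording, **sc2_params_for_preprocessed_input()
    )


print(sc2_params_for_preprocessed_input())
```

The remaining sorter parameters stay at their defaults here; which defaults apply depends on the installed SpikeInterface version.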

yger avatar Jun 19 '24 12:06 yger

Also I'll tag @magland, so he can comment if he wants about MS5 stuff.

zm711 avatar Jun 19 '24 14:06 zm711

I am using v0.100.6. I am passing preprocessed data to the sc2 sorter: it is re-referenced, notch filtered, and bandpass filtered (300-6000 Hz).

kshtjkumar avatar Jun 19 '24 16:06 kshtjkumar

@kshtjkumar I would play with the ms5 parameters. I'm thinking especially about the detect threshold.

magland avatar Jun 19 '24 18:06 magland

Could you please suggest how to go about that?

kshtjkumar avatar Jun 19 '24 20:06 kshtjkumar

Could you provide the code/script you are using to run this? Then we can show how to adjust the parameters.

magland avatar Jun 20 '24 10:06 magland

Sure:

import mountainsort5 as ms5
import spikeinterface.preprocessing as spre

# recording_resampled_ecog and output_folder come from earlier in the script (not shown)
recording_ecog = spre.bandpass_filter(recording_resampled_ecog, freq_min=300, freq_max=6000)  # bandpass filter
recording_notch_ecog = spre.notch_filter(recording_ecog, q=50)  # notch filter (q is the quality factor)
rec_ecog_ref = spre.common_reference(recording_notch_ecog, operator="median", reference="global")  # re-reference the data
rec_normed = spre.zscore(recording=rec_ecog_ref)  # z-score each channel
rec_2 = spre.whiten(rec_normed)  # whiten before sorting

# MountainSort5 scheme 2
sorting_rec = ms5.sorting_scheme2(
    rec_2,
    sorting_parameters=ms5.Scheme2SortingParameters(
        phase1_detect_channel_radius=150,
        detect_channel_radius=50,
        training_duration_sec=60,
    ),
)
print(output_folder)
print("Sorter found", len(sorting_rec.get_unit_ids()), "units")
sorting_rec = sorting_rec.remove_empty_units()
print("Sorter found", len(sorting_rec.get_unit_ids()), "non empty units")

kshtjkumar avatar Jun 25 '24 07:06 kshtjkumar

The parameter to add to Scheme2SortingParameters is detect_threshold. You can see ms5 parameters here

https://github.com/flatironinstitute/mountainsort5/blob/main/docs/scheme2.md

And the defaults are here

https://github.com/flatironinstitute/mountainsort5/blob/main/mountainsort5/schemes/Scheme2SortingParameters.py
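To make the suggestion concrete, here is a sketch of adding detect_threshold to the parameters from the script above. The helper names are hypothetical and the threshold value is arbitrary; the other three values are copied from the posted script.

```python
def scheme2_kwargs(detect_threshold):
    """Keyword arguments for ms5.Scheme2SortingParameters (sketch)."""
    return {
        "phase1_detect_channel_radius": 150,   # from the posted script
        "detect_channel_radius": 50,           # from the posted script
        "training_duration_sec": 60,           # from the posted script
        "detect_threshold": detect_threshold,  # the parameter being added
    }


def sort_with_threshold(recording, detect_threshold):
    """Run MountainSort5 scheme 2 with an explicit detection threshold."""
    # Imported lazily so the kwargs helper works without mountainsort5.
    import mountainsort5 as ms5

    params = ms5.Scheme2SortingParameters(**scheme2_kwargs(detect_threshold))
    return ms5.sorting_scheme2(recording, sorting_parameters=params)


# Example value only; it is not a recommendation.
print(scheme2_kwargs(detect_threshold=5.0))
```

Because detection in this script runs on whitened data, the threshold is expressed relative to the noise level, so lowering it admits more (and noisier) events.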

magland avatar Jun 25 '24 12:06 magland

I've experimented with various thresholds ranging from 4 to 6, but I'm unsure how to determine the optimal threshold. I could continue iterating indefinitely, but what criteria should I use to make the final decision? I'm struggling to grasp the correct approach for this.

kshtjkumar avatar Jul 07 '24 22:07 kshtjkumar

@kshtjkumar Unfortunately it's very difficult to know which parameters are best for a particular type of data. I think you'll need to rely on visual inspection of the spike sorting outputs (correlograms, spike trains, spike amplitudes, waveforms, etc.).
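Visual inspection across several thresholds is easier with a compact summary next to the plots. The sketch below is one hypothetical way to tabulate a sweep from plain spike-time lists; the function names, the 1.5 ms refractory period, and the 1% cutoff are all assumptions, and the table complements inspection rather than replacing it.

```python
def refractory_violation_fraction(spike_times_s, refractory_s=0.0015):
    """Fraction of inter-spike intervals shorter than the refractory period."""
    times = sorted(spike_times_s)
    if len(times) < 2:
        return 0.0
    isis = [b - a for a, b in zip(times, times[1:])]
    return sum(isi < refractory_s for isi in isis) / len(isis)


def summarize_sweep(sortings_by_threshold, duration_s, max_violation_fraction=0.01):
    """One summary row per detection threshold.

    sortings_by_threshold maps threshold -> {unit_id: [spike times in s]}.
    """
    rows = []
    for threshold in sorted(sortings_by_threshold):
        units = sortings_by_threshold[threshold]
        # Units whose refractory violations stay under the chosen cutoff.
        clean = [
            unit_id for unit_id, times in units.items()
            if refractory_violation_fraction(times) <= max_violation_fraction
        ]
        n_spikes = sum(len(times) for times in units.values())
        rows.append({
            "detect_threshold": threshold,
            "n_units": len(units),
            "n_clean_units": len(clean),
            "mean_rate_hz": n_spikes / duration_s / max(len(units), 1),
        })
    return rows


# Toy data standing in for two sorting runs.
toy = {
    4.0: {0: [0.1, 0.1005, 0.9], 1: [0.2, 0.7, 1.4]},
    6.0: {0: [0.2, 0.7, 1.4]},
}
for row in summarize_sweep(toy, duration_s=2.0):
    print(row)
```

A threshold at which the number of clean units stops changing much is a reasonable place to start looking at correlograms and waveforms in detail.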

magland avatar Jul 10 '24 16:07 magland

@magland Do you think changing the training time for the algorithm will help? Let's say my total recording duration is 120 min and, among the scheme 2 parameters, I set training_duration_sec to cover the full 120 min (7200 s) as well. Will that give better sorting?

kshtjkumar avatar Aug 09 '24 22:08 kshtjkumar

Possibly. I haven't tried varying that parameter too much.
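Since training_duration_sec is expressed in seconds, a 120-minute training window has to be converted before it is passed in. A small sketch with a hypothetical helper follows; capping the value at the recording length is a defensive assumption, not documented MountainSort5 behavior.

```python
def training_duration_sec(requested_minutes, recording_duration_sec):
    """Convert a training window in minutes to seconds, capped at the recording length."""
    requested_sec = requested_minutes * 60.0
    return min(requested_sec, recording_duration_sec)


# 120 min of training on a 120 min (7200 s) recording.
print(training_duration_sec(120, recording_duration_sec=7200.0))
```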

magland avatar Aug 10 '24 00:08 magland

@kshtjkumar I'm curious about your neural interface. Is it an ECoG array or a single-unit probe? If it is ECoG, each electrode contact receives a summation of activity from many neurons; is it plausible to sort single units from such recordings?

David-H-Chang avatar Aug 23 '24 13:08 David-H-Chang

Hi @DaohanZhang, it's a depth probe for single units; the particular case above is for a tetrode (4-channel depth probe).

kshtjkumar avatar Aug 24 '24 19:08 kshtjkumar