Mystery error while running SC2
SC2 is crashing mid-run without giving me any indication why.
Preprocessing the recording (bandpass filtering + CMR + whitening)
noise_level (no parallelization): 100%|███████████████████████████████████████████████████████████████████████████████| 20/20 [00:10<00:00, 1.95it/s]
Error running spykingcircus2
It was working fine until I replaced hard-coded parameters in the call with variables, so that I could explore the parameter space a bit and find what works for me. I therefore suspect the problem is a parameter value, but in this test run the variables should hold the same values as the hard-coded ones I was using before.
I would like to recommend that some sanity-checking of parameters be carried out, so that an error message like this can say something useful about where to troubleshoot.
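To make the suggestion concrete, here is a minimal sketch of the kind of guard I mean. check_sorter_params is not an existing SI function, just an illustration; it only relies on get_default_sorter_params, which is.

from spikeinterface.sorters import get_default_sorter_params

def check_sorter_params(sorter_name, user_params):
    # Hypothetical helper: compare user-supplied keys against the sorter's
    # default params so that typos fail loudly before the run starts.
    defaults = get_default_sorter_params(sorter_name)
    for key, value in user_params.items():
        if key not in defaults:
            raise ValueError(f"Unknown parameter {key!r} for sorter {sorter_name!r}")
        # For the nested dicts (detection, selection, ...) check the inner keys too.
        if isinstance(value, dict) and isinstance(defaults[key], dict):
            unknown = set(value) - set(defaults[key])
            if unknown:
                raise ValueError(f"Unknown keys {unknown} in the {key!r} dict")

# A typo'd key would then fail with a readable message instead of a bare
# "Error running spykingcircus2" mid-run:
check_sorter_params('spykingcircus2', {'detection': {'peak_sign': 'neg', 'detect_treshold': 5}})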
This is my sorter call:
import os

joblist = [
    {
        'sorter_name': 'spykingcircus2',
        'recording': recordings[row.Well_ID],
        'output_folder': os.path.join(params.sortedPath, params.experiment, 'spykingcircus2', row.Well_ID),
        'remove_existing_folder': True,
        'verbose': True,
        'raise_error': False,
        'general': {'ms_before': 2, 'ms_after': 2, 'radius_um': params.GNrad},
        'sparsity': {'method': params.SPmethod, 'amplitude_mode': 'peak_to_peak', 'radius': params.SPrad, 'threshold': params.SPthresh},
        'filtering': {'freq_min': params.BPmin, 'freq_max': params.BPmax, 'ftype': params.BPtype, 'filter_order': params.BPorder, 'margin_ms': params.BPmargin},
        'whitening': {'mode': 'local', 'regularize': False},
        'detection': {'peak_sign': 'neg', 'detect_threshold': params.DTthresh},
        'selection': {'method': params.SLmethod, 'n_peaks_per_channel': params.SLnppc, 'min_n_peaks': params.SLmnp, 'select_per_channel': False},
        'apply_motion_correction': False,
        'motion_correction': {'preset': 'dredge_fast'},
        'merging': {'similarity_kwargs': {'method': 'cosine', 'support': 'union', 'max_lag_ms': params.MGlag},
                    'correlograms_kwargs': {},
                    'auto_merge': {'min_spikes': params.MGminspikes, 'corr_diff_thresh': params.MGcorrthresh}},
        'clustering': {'legacy': params.CLlegacy},
        'matching': {'method': 'circus-omp-svd'},
        'apply_preprocessing': True,
        'matched_filtering': True,
        'cache_preprocessing': {'mode': 'memory', 'memory_limit': params.resources, 'delete_cache': True},
        'multi_units_only': False,
        'job_kwargs': {'n_jobs': params.cores},  ## does not seem to obey global_job_kwargs. Defaults to multithreading and that crashes SC2, so should always be set to 1.
        'debug': True,
    }
    for row in well_pool.itertuples()
    if row.Well_ID in recordings
]
Ignore the list comprehension; there is only one recording in this test run.
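For completeness, a minimal call consuming this joblist would look something like this (the 'loop' engine is the default; return_output collects the sortings):

from spikeinterface.sorters import run_sorter_jobs

# each dict in joblist is unpacked into one run_sorter() call
sortings = run_sorter_jobs(job_list=joblist, engine='loop', return_output=True)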
The values of these parameters for this run are all either values I have previously used successfully, or the default values mentioned in the docs.
Namespace(experiment='DoseRes', row=82, maxrow=82, dataPath='/SCRATCH/mea/data/', preproPath='/SCRATCH/mea/preprocessed/', sortedPath='/SCRATCH/mea/sorted/', cores=1, memory='10G', resources=0.5, tasks=1, sorter='spykingcircus2', BPmin=100, BPmax=3500, BPtype='bessel', BPorder=1, BPmargin=5.0, Wmode='local', GNrad=250, SPmethod='snr', SPrad=20, SPthresh=0.25, DTmethod='locally_exclusive', DTthresh=5, DTsweep=0.1, SLmethod='uniform', SLnppc=5000, SLmnp=100000, MGlag=0.2, MGminspikes=10, MGcorrthresh=0.25, CLlegacy=False, hdf5='/SCRATCH/REFERENCES/maxwell_plugin')
PS. There seems to be disagreement between get_default_sorter_params('spykingcircus2') and get_sorter_params_description('spykingcircus2') about what some of the defaults actually are, for example for the selection dict.
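Both functions exist in spikeinterface.sorters, so the mismatch is easy to reproduce side by side:

from spikeinterface.sorters import get_default_sorter_params, get_sorter_params_description

defaults = get_default_sorter_params('spykingcircus2')
descriptions = get_sorter_params_description('spykingcircus2')
# e.g. the 'selection' entries disagree between the two:
print(defaults['selection'])
print(descriptions.get('selection'))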
I am using SI version 0.102.0
While trying to isolate the offending parameter/dict, I find that the error sometimes pops up later in the run:
Preprocessing the recording (bandpass filtering + CMR + whitening)
noise_level (no parallelization): 100%|███████████████████████████████████████████████████████████████████████████████| 20/20 [00:10<00:00, 1.95it/s]
write_memory_recording (no parallelization): 100%|██████████████████████████████████████████████████████████████████████| 2/2 [03:01<00:00, 90.87s/it]
detect peaks using locally_exclusive + 1 node (no parallelization): 100%|███████████████████████████████████████████████| 2/2 [00:12<00:00, 6.12s/it]
detect peaks using matched_filtering (no parallelization): 100%|███████████████████████████████████████████████████████| 2/2 [06:33<00:00, 196.85s/it]
Kept 340034 peaks for clustering
extracting features (no parallelization): 100%|█████████████████████████████████████████████████████████████████████████| 2/2 [00:27<00:00, 13.92s/it]
/SOFTWARE/miniforge3/envs/new_mea_env/lib/python3.12/site-packages/sklearn/utils/deprecation.py:151: FutureWarning: 'force_all_finite' was renamed to 'ensure_all_finite' in 1.6 and will be removed in 1.8.
warnings.warn(
estimate_templates (no parallelization): 100%|██████████████████████████████████████████████████████████████████████████| 2/2 [00:03<00:00, 1.76s/it]
Error running spykingcircus2
I don't think the deprecation warning has anything to do with this; I think I see it even on successful runs.
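Since raise_error is False in my joblist, the traceback never reaches the console; as far as I understand, it should be captured in the spikeinterface_log.json written to the output folder, so something like this should surface it (the 'error'/'error_trace' keys are what I see in my install, so treat them as an assumption, and 'A1' is a placeholder well ID):

import json
from pathlib import Path

log_file = Path(params.sortedPath) / params.experiment / 'spykingcircus2' / 'A1' / 'spikeinterface_log.json'
log = json.loads(log_file.read_text())
if log.get('error'):
    # 'error_trace' holds the full traceback in my version (an assumption)
    print(log.get('error_trace', log))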
Can you try #3847 to see if the error is still present in this updated version? There was a bug in SC2 related to the latest SI release, but I'm not sure it is related here.
I have SI installed through Conda, so I'll see what I can do...
In the meantime, the problem seems to come from my trying to change methods and control more parameters in the sparsity and detection dicts. More dicts may be affected, but I haven't got to them yet. Clearly some values or parameter combinations are not valid.
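A sketch of the bisection I have in mind: start from all defaults and override one dict at a time, with raise_error=True so the first failure yields a real traceback. my_overrides and recording are stand-ins for the non-default dicts and the recording object from the joblist above:

from spikeinterface.sorters import run_sorter, get_default_sorter_params

my_overrides = {
    'sparsity': {'method': params.SPmethod, 'amplitude_mode': 'peak_to_peak',
                 'radius': params.SPrad, 'threshold': params.SPthresh},
    'detection': {'peak_sign': 'neg', 'detect_threshold': params.DTthresh},
}

for name, override in my_overrides.items():
    sorter_params = get_default_sorter_params('spykingcircus2')
    sorter_params[name] = override
    try:
        run_sorter('spykingcircus2', recording,
                   output_folder=f'isolate_{name}',
                   remove_existing_folder=True,
                   raise_error=True,  # get the full traceback this time
                   **sorter_params)
        print(name, 'OK')
    except Exception as exc:
        print(name, 'FAILED:', exc)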
But it is very difficult to piece together which parameters are available and which options each of them accepts. As sorters get re-implemented on top of SI's lower-level functions, the relevant info gets buried away. The docs for run_sorter_jobs only mention that SC2 is an option. get_default_sorter_params('spykingcircus2') and get_sorter_params_description('spykingcircus2') don't offer a full explanation of the various dicts. Only by accident, through frustrated googling, did I find out that the dicts correspond to sortingcomponents functions. And there again things can be incomplete; again only by accident did I discover that some things are explained in core, and by that point they often look different too.
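For example, the only reliable way I found to see the detection options was to introspect the underlying sortingcomponents function directly. The module path below is real; that every key of the 'detection' dict maps onto detect_peaks kwargs is my inference:

import inspect
from spikeinterface.sortingcomponents.peak_detection import detect_peaks

print(inspect.signature(detect_peaks))  # the top-level arguments
help(detect_peaks)  # in my install the docstring also lists the per-method kwargs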
I think it would be very useful to ensure that all those docs are linked across layers, so those of us who are not developers of the package and don't know how the higher level methods are implemented can still find the relevant info for all the possible parameters.
Yes, I agree, sorry for the mess here. The sortingcomponents were experimental for a while, and we are currently in the process of writing the paper, and thus proper documentation. @alejoe91 @samuelgarcia I'm willing to write some docs for the components and the internal sorters, but where would the appropriate place be?
I appreciate the work you are doing, it is a big project in progress, and I know documentation is not a small task at all.