error: ValueError: n_samples=5 should be >= n_clusters=6 -> Spike sorting failed.
Describe the issue:
I am using Kilosort4 to sort a 64-channel recording. It fails with an error saying n_samples is too small. Does that mean I don't have enough data to sort? How can I fix it?
Reproduce the bug:
No response
Error message:
Traceback (most recent call last):
File "/home/wangxy/wxy_sorting_curation.py", line 68, in <module>
aggregate_sorting = si.run_sorter_by_property(sorter_name='kilosort4', recording=rec,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/wangxy/conda/envs/si_env/lib/python3.11/site-packages/spikeinterface/sorters/launcher.py", line 297, in run_sorter_by_property
sorting_list = run_sorter_jobs(job_list, engine=engine, engine_kwargs=engine_kwargs, return_output=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/wangxy/conda/envs/si_env/lib/python3.11/site-packages/spikeinterface/sorters/launcher.py", line 106, in run_sorter_jobs
sorting = run_sorter(**kwargs)
^^^^^^^^^^^^^^^^^^^^
File "/home/wangxy/conda/envs/si_env/lib/python3.11/site-packages/spikeinterface/sorters/runsorter.py", line 175, in run_sorter
return run_sorter_local(**common_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/wangxy/conda/envs/si_env/lib/python3.11/site-packages/spikeinterface/sorters/runsorter.py", line 225, in run_sorter_local
SorterClass.run_from_folder(output_folder, raise_error, verbose)
File "/home/wangxy/conda/envs/si_env/lib/python3.11/site-packages/spikeinterface/sorters/basesorter.py", line 293, in run_from_folder
raise SpikeSortingError(
spikeinterface.sorters.utils.misc.SpikeSortingError: Spike sorting error trace:
Traceback (most recent call last):
File "/home/wangxy/conda/envs/si_env/lib/python3.11/site-packages/spikeinterface/sorters/basesorter.py", line 258, in run_from_folder
SorterClass._run_from_folder(sorter_output_folder, sorter_params, verbose)
File "/home/wangxy/conda/envs/si_env/lib/python3.11/site-packages/spikeinterface/sorters/external/kilosort4.py", line 240, in _run_from_folder
st, tF, _, _ = detect_spikes(ops, device, bfile, tic0=tic0, progress_bar=progress_bar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/wangxy/conda/envs/si_env/lib/python3.11/site-packages/kilosort/run_kilosort.py", line 398, in detect_spikes
st0, tF, ops = spikedetect.run(ops, bfile, device=device, progress_bar=progress_bar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/wangxy/conda/envs/si_env/lib/python3.11/site-packages/kilosort/spikedetect.py", line 188, in run
ops['wPCA'], ops['wTEMP'] = extract_wPCA_wTEMP(
^^^^^^^^^^^^^^^^^^^
File "/home/wangxy/conda/envs/si_env/lib/python3.11/site-packages/kilosort/spikedetect.py", line 74, in extract_wPCA_wTEMP
model = KMeans(n_clusters=ops['settings']['n_templates'], n_init = 10).fit(clips)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/wangxy/conda/envs/si_env/lib/python3.11/site-packages/sklearn/base.py", line 1474, in wrapper
return fit_method(estimator, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/wangxy/conda/envs/si_env/lib/python3.11/site-packages/sklearn/cluster/_kmeans.py", line 1490, in fit
self._check_params_vs_input(X)
File "/home/wangxy/conda/envs/si_env/lib/python3.11/site-packages/sklearn/cluster/_kmeans.py", line 1431, in _check_params_vs_input
super()._check_params_vs_input(X, default_n_init=10)
File "/home/wangxy/conda/envs/si_env/lib/python3.11/site-packages/sklearn/cluster/_kmeans.py", line 879, in _check_params_vs_input
raise ValueError(
ValueError: n_samples=5 should be >= n_clusters=6.
Version information:
kilosort4
Context for the issue:
No response
Experiment information:
No response
I have been experiencing the same issue. Does this mean no units were detected?
Please make sure you are using the latest version of Kilosort4, and run it without SpikeInterface. If you still encounter the error, upload kilosort4.log here from the results directory.
Hi, I used the most recent version of Kilosort4 without SpikeInterface and I still get this error.
File "C:\Users\hiroy\anaconda3\envs\kilosort\lib\site-packages\IPython\core\interactiveshell.py", line 3553, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "
The log is here:
06-15 12:36 kilosort.run_kilosort INFO Kilosort version 4.0.12
06-15 12:36 kilosort.run_kilosort INFO Sorting C:\UNITS\672\672-142\shdat3.bin
06-15 12:36 kilosort.run_kilosort INFO ----------------------------------------
06-15 12:36 kilosort.run_kilosort INFO Skipping common average reference.
06-15 12:36 kilosort.run_kilosort INFO Using GPU for PyTorch computations. Specify device to change this.
06-15 12:36 kilosort.run_kilosort DEBUG Initial ops:
{ 'n_chan_bin': 8,
'fs': 32000.0,
'batch_size': 160000,
'nblocks': 1,
'Th_universal': 9,
'Th_learned': 8,
'tmin': 0,
'tmax': inf,
'nt': 65,
'shift': None,
'scale': None,
'artifact_threshold': inf,
'nskip': 25,
'whitening_range': 8,
'binning_depth': 5,
'sig_interp': 20,
'drift_smoothing': [0.5, 0.5, 0.5],
'nt0min': 21,
'dmin': None,
'dminx': 1,
'min_template_size': 200,
'template_sizes': 5,
'nearest_chans': 1,
'nearest_templates': 1,
'max_channel_distance': 1,
'templates_from_data': True,
'n_templates': 6,
'n_pcs': 6,
'Th_single_ch': 6,
'acg_threshold': 0.2,
'ccg_threshold': 0.25,
'cluster_downsampling': 20,
'x_centers': None,
'duplicate_spike_bins': 7,
'filename': WindowsPath('C:/UNITS/672/672-142/shdat3.bin'),
'data_dir': WindowsPath('C:/UNITS/672/672-142'),
'data_dtype': 'float32',
'do_CAR': False,
'invert_sign': False,
'NTbuff': 160130,
'Nchan': 8,
'torch_device': 'cuda',
'save_preprocessed_copy': False,
'chanMap': array([0, 1, 2, 3, 4, 5, 6, 7]),
'xc': array([10., 10., 10., 10., 10., 10., 10., 10.], dtype=float32),
'yc': array([ 0., 20., 40., 60., 80., 100., 120., 140.], dtype=float32),
'kcoords': array([0., 0., 0., 0., 0., 0., 0., 0.]),
'n_chan': 8}
06-15 12:36 kilosort.run_kilosort INFO
06-15 12:36 kilosort.run_kilosort INFO Computing preprocessing variables.
06-15 12:36 kilosort.run_kilosort INFO ----------------------------------------
06-15 12:36 kilosort.run_kilosort INFO Preprocessing filters computed in 0.06s; total 0.06s
06-15 12:36 kilosort.run_kilosort DEBUG hp_filter shape: torch.Size([30122])
06-15 12:36 kilosort.run_kilosort DEBUG whiten_mat shape: torch.Size([8, 8])
06-15 12:36 kilosort.run_kilosort INFO
06-15 12:36 kilosort.run_kilosort INFO Computing drift correction.
06-15 12:36 kilosort.run_kilosort INFO ----------------------------------------
06-15 12:36 kilosort.spikedetect INFO Re-computing universal templates from data.
@hiroyukioya My guess is that, because of your large batch size (160000), the step that re-computes universal templates from data is not selecting enough waveforms to cluster into templates, unless your recording is quite long. There are two options. First, you can try reducing batch_size to around the default of 60000 (or maybe 64000, for your sampling rate). Second, you can use the pre-generated universal templates by setting templates_from_data = False; in that case you would also need to set nt = 61 to make that work.
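For reference, here is a minimal sketch of the two options as changes to the settings dict. The starting values are copied from the log above; the helper names smaller_batches and universal_templates are made up for illustration and are not part of Kilosort. The 2-second batch duration is only an inference from the suggested 64000 samples at fs = 32000.

```python
# Relevant keys copied from the "Initial ops" log above.
logged_settings = {
    "n_chan_bin": 8,
    "fs": 32000.0,
    "batch_size": 160000,
    "nt": 65,
    "templates_from_data": True,
}


def smaller_batches(settings, seconds_per_batch=2.0):
    """Option 1 (hypothetical helper): set batch_size to a fixed duration.

    2.0 s is an assumption; at fs = 32000 it gives the suggested 64000.
    """
    updated = dict(settings)  # copy so the logged settings stay untouched
    updated["batch_size"] = int(round(seconds_per_batch * settings["fs"]))
    return updated


def universal_templates(settings):
    """Option 2 (hypothetical helper): use the pre-generated templates."""
    updated = dict(settings)
    updated["templates_from_data"] = False
    updated["nt"] = 61  # required alongside templates_from_data = False
    return updated


option_1 = smaller_batches(logged_settings)
option_2 = universal_templates(logged_settings)

print(option_1["batch_size"])  # 64000
print(option_2["templates_from_data"], option_2["nt"])  # False 61
```

Either dict can then be passed as the settings argument of kilosort.run_kilosort together with your probe. That call is left out here so the sketch runs without Kilosort installed.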
Another possibility: I noticed your data is float32. Some other users have reported problems that ultimately were happening because their data was on a very different scale from what KS4 expects for standard int16 data. I would recommend you try loading the data in the KS4 GUI to make sure it looks sensible. If it looks blank or washed out, you may need to use the scale parameter to adjust it - try to get the data roughly on the order of -100 to +100 or larger.
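If you want a rough number before opening the GUI, something like the sketch below can help. It reads the first values of a raw float32 file and suggests a multiplier that brings the largest absolute value up to about 100. Both functions are hypothetical helpers, not Kilosort functions. The sketch assumes the file is headerless float32 in the machine's native byte order, and it uses the raw peak, so a single artifact in the chunk will make the suggested factor too small. Treat the result as a starting point for the scale parameter, not a definitive value.

```python
import array


def read_float32_samples(path, n_values):
    """Read up to n_values float32 values from the start of a raw binary file.

    Assumes headerless data in native byte order (hypothetical helper).
    """
    samples = array.array("f")
    with open(path, "rb") as f:
        data = f.read(n_values * samples.itemsize)
    # Drop a trailing partial value if the read was cut short.
    usable = len(data) - len(data) % samples.itemsize
    samples.frombytes(data[:usable])
    return samples


def suggest_scale(samples, target_peak=100.0):
    """Return a multiplier that lifts the peak |value| to about target_peak.

    Returns 1.0 if the data already reaches target_peak, and None if the
    chunk is empty or all zeros (no factor can be estimated).
    """
    peak = max((abs(x) for x in samples), default=0.0)
    if peak == 0.0:
        return None
    return max(1.0, target_peak / peak)


# Example with made-up values on a small scale, e.g. data stored in volts:
print(suggest_scale([0.25, -0.5, 0.125]))  # 200.0
```

For a real recording you would call read_float32_samples on your .bin file, using a chunk long enough to contain some spikes, and pass the result to suggest_scale.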
If you still encounter problems after trying the above changes please let us know and we can re-open this.