
Cluster–channel mismatch in Phy after QC with SpikeInterface

wenxin8023 opened this issue 1 month ago • 5 comments

Hi, I’m spike-sorting 128-channel Intan recordings from a flexible probe using Kilosort4 and running quality metrics with SpikeInterface. I generated the probe layout and channel mapping using probeinterface, attached it to the recording, and exported the results to Phy.

However, in Phy I found that the cluster–channel mapping is inconsistent with the original mapping (see the screenshots below, using the cluster with the highest firing rate as an example):

Before QC: Image

After QC: Image

I suspect this may be related to the device_channel_indices, channel ordering, or how the probe is attached, but I am not sure what the correct procedure is.

Below is the exact code used for probe construction, channel mapping, attaching the probe, waveform extraction, and exporting:

import spikeinterface.full as si  # "full" exposes the preprocessing functions (bandpass_filter, common_reference, ...) used below
import spikeinterface.extractors as se
import probeinterface as pi
import numpy as np
from spikeinterface.qualitymetrics import (
    compute_snrs,
    compute_firing_rates,
    compute_isi_violations,
    compute_quality_metrics,
)

recording = se.read_intan('E:/Desktop/flexible electrode/LL NO 170/acute-m03-stratium/acute-m03-striatum-3_251209_152340.rhd', stream_id='0')
sorting = se.read_phy('E:/Desktop/flexible electrode/LL NO 170/acute-m03-striatum/kilosort4')

num_ch = 128


x = np.zeros(num_ch)
y = 776 + np.arange(num_ch) * (1500 / 128)
positions = np.column_stack((x, y))

ks_map = np.array([
    73,77,74,82,83,81,90,88,102,104,108,110,117,109,118,114,75,79,
    76,80,78,87,89,84,101,107,113,103,115,111,116,112,71,69,72,70,
    86,85,91,92,99,100,105,106,119,121,120,122,67,65,68,66,93,64,
    94,95,98,96,97,127,123,125,124,126,15,13,16,18,23,17,20,24,43,40,
    39,46,47,45,48,50,11,9,12,10,14,19,25,26,37,38,49,44,51,53,52,54,
    7,5,8,6,22,21,27,28,35,36,41,42,55,57,56,58,3,1,4,2,29,0,30,31,
    34,32,33,63,59,61,60,62
])
ks_map_str = np.array([f"A-{i:03d}" for i in ks_map])
recording = recording.select_channels(ks_map_str)

probe = pi.Probe(ndim=2, si_units="um")
probe.set_contacts(
    positions=positions[ks_map],
    shapes="circle",
    shape_params={"radius": 5}
)
probe.set_device_channel_indices(np.arange(128))

recording = recording.set_probe(probe)
print(probe)

recording_signed = si.unsigned_to_signed(recording)
recording_f = si.bandpass_filter(recording_signed, freq_min=300, freq_max=6000)
print(recording_f)
recording_cmr = si.common_reference(recording_f, reference="global", operator="median")
print(recording_cmr)

analyzer = si.create_sorting_analyzer(sorting=sorting, recording=recording_cmr, format="memory")
analyzer.compute("random_spikes", method="uniform", max_spikes_per_unit=600, seed=2205)
analyzer.compute("waveforms", ms_before=1.3, ms_after=2.6, n_jobs=2)
analyzer.compute("templates", operators=["average", "median", "std"])
analyzer.compute("noise_levels")
analyzer.compute("principal_components", n_components=3, mode="by_channel_global", whiten=True)

metrics = compute_quality_metrics(
    analyzer,
    metric_names=["snr", "isi_violation", "firing_rate", "presence_ratio", "amplitude_cutoff", "d_prime", "isolation_distance", "l_ratio"],
    extension_params={
        "isi_violation": {"isi_threshold_ms": 2.0}
    }
)
from spikeinterface.exporters import export_to_phy
export_to_phy(sorting_analyzer=analyzer, output_folder='E:/Desktop/flexible electrode/LL NO 170/acute-m03-striatum/phy_with_metrics')

Is there anything obviously wrong in my code, especially in how I construct the probe, assign device_channel_indices, or attach the probe to the recording? I’m not sure whether the mismatch in Phy is caused by incorrect channel ordering or by the way I applied the probe mapping.

Any suggestions would be greatly appreciated!

wenxin8023 avatar Dec 11 '25 06:12 wenxin8023

Hi @wenxin8023 , yeah there might be an issue when you make the probe. Luckily, kilosort saves all the probe information you need in either a channel_positions.npy file or a probe.prb file (depending on version).

If you have one of these files, you can use

from pathlib import Path
from probeinterface import read_prb

phy_path = Path("path/to/kilosort/output")  # wherever kilosort wrote its results
probegroup = read_prb(phy_path / "probe.prb")
probe = probegroup.probes[0]

or

from pathlib import Path
from probeinterface import Probe
import numpy as np

phy_path = Path("path/to/kilosort/output")  # wherever kilosort wrote its results

probe = Probe(si_units="um")
channel_positions = np.load(phy_path / "channel_positions.npy")
probe.set_contacts(channel_positions)
probe.set_device_channel_indices(range(probe.get_contact_count()))

this should ensure that you're using the exact same probe that kilosort used.
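To spell out what set_device_channel_indices means, here is a toy pure-NumPy sketch (made-up 4-channel map, not a real probe): device_channel_indices[i] is the device/recording channel wired to contact i. One common pitfall is applying the same permutation twice, e.g. once via select_channels and again via positions[ks_map].

```python
import numpy as np

# Toy 4-channel illustration (made-up map, not a real probe):
# device_channel_indices[i] = the device/recording channel wired to contact i.
ks_map = np.array([2, 0, 3, 1])
device_traces = np.arange(4)             # traces in acquisition order: [0 1 2 3]

# Reordering traces by the map puts them in contact (depth) order:
print(device_traces[ks_map])             # -> [2 0 3 1]

# Pitfall: permuting twice scrambles the order a second time:
print(device_traces[ks_map][ks_map])     # -> [3 2 1 0]
```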

Also note that Phy is no longer actively maintained. I'm biased so you don't have to trust me, but I'd recommend trying to visualise your analyzer using spikeinterface-gui https://github.com/SpikeInterface/spikeinterface-gui :)

chrishalcrow avatar Dec 11 '25 10:12 chrishalcrow

Thank you very much for your reply! By the way, could you please take a look at the rest of my code to see if there are any potential issues?

Specifically, I would like to confirm whether it is correct to apply filtering and CMR (common median referencing) before computing the quality metrics. I want to make sure that these preprocessing steps will not affect the accuracy of the metrics calculation.

Thanks again for your help!

wenxin8023 avatar Dec 11 '25 11:12 wenxin8023

Yes, you should definitely apply the preprocessing before computing quality metrics.

I would set your n_jobs globally so that you can use multiprocessing for a load of computations (note: there are known issues with Windows + data-on-a-server + multiprocessing, so there's a small chance it'll crash depending on your situation... if it does, revert back). Just stick this near the top of your code:

si.set_global_job_kwargs(n_jobs=8)
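If you'd rather not hard-code the core count, a small stdlib sketch for picking n_jobs portably (the spikeinterface call itself is shown commented out, since it needs your session):

```python
import os

# Cautious default: roughly half the available cores, but at least 1.
n_jobs = max(1, (os.cpu_count() or 2) // 2)
print(n_jobs)

# then, in your spikeinterface session:
# si.set_global_job_kwargs(n_jobs=n_jobs)
```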

I have a 20-core machine and usually use 8 cores. You can also combine your compute code into one command, including quality metrics. I like this, but some others don't:

analyzer.compute({
    "random_spikes": {
        "method": "uniform", 
        "max_spikes_per_unit": 600,
        "seed": 2205,
    },
    "waveforms": {
        "ms_before": 1.3,
        "ms_after": 2.6,
    },
...

You should probably also compute spike amplitudes and locations

analyzer.compute({
    "spike_amplitudes": {},
    "spike_locations": {},
})

Also, if you don't specify which quality metrics to compute, spikeinterface will just compute everything it can. I would do this! And compute some template metrics while you're at it:

analyzer.compute({
    "quality_metrics": {},
    "template_metrics": {"include_multi_channel_metrics": True}
})

chrishalcrow avatar Dec 11 '25 11:12 chrishalcrow

Thanks again for your help !!!

I have one more question: after exporting the SpikeInterface-computed quality metrics into Phy, I noticed that the original Intan channel indices and the KSLabel are no longer present in the Phy output (ClusterView). Is there a recommended way to add these properties back so that they appear correctly in Phy?

Thanks!

wenxin8023 avatar Dec 11 '25 11:12 wenxin8023

You can add back the KSLabel (or any other property available) with this additional argument: additional_properties=["KSLabel"].
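For context, Phy picks up any cluster_<name>.tsv file in the sorting folder as a ClusterView column, which is presumably how additional_properties gets there. A minimal hand-written sketch (made-up cluster ids and labels, written to a temp folder):

```python
import csv
import os
import tempfile

# Phy shows any cluster_<name>.tsv in the sorting folder as a ClusterView
# column. Hypothetical cluster ids and labels:
rows = [(0, "good"), (1, "mua"), (2, "good")]
out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, "cluster_KSLabel.tsv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(["cluster_id", "KSLabel"])  # header Phy expects
    writer.writerows(rows)
print(path)
```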

What column do you refer to as "original Intan channel indices"?

alejoe91 avatar Dec 15 '25 14:12 alejoe91

> Also note that Phy is no longer actively maintained. I'm biased so you don't have to trust me, but I'd recommend trying to visualise your analyzer using spikeinterface-gui https://github.com/SpikeInterface/spikeinterface-gui :)

Hi! I want to endorse this, as I am moving away from Phy to spikeinterface-gui. Phy is easy to use, but it is somehow slow on my computer when the dataset has a lot of spikes; spikeinterface-gui does not have this problem. Plus, there is a new YouTube tutorial, so the transition is not that hard.

P.S. my full transition is probably going to be after SortingAnalyzer can directly read output from Kilosort4 standalone <#4202>, because somehow on my computer, Kilosort4 standalone sorts things out much faster than Kilosort4 via SpikeInterface.

chiyu1203 avatar Dec 17 '25 19:12 chiyu1203

> P.S. my full transition is probably going to be after SortingAnalyzer can directly read output from Kilosort4 standalone <#4202> because somehow on my computer, Kilosort4 standalone sorts things out much faster than Kilosort4 via SpikeInterface

Hi @chiyu1203 glad you're enjoying si-gui! Another pipeline option is to sort using kilosort standalone, then start from reading in the sorting folder. A simple pipeline using this option would look something like:

# run kilosort separately, then:
import spikeinterface.full as si

recording = si.read_openephys('path/to/recording')  # or however you read the recording
sorting = si.read_kilosort('path/to/kilosort/output/')

analyzer = si.create_sorting_analyzer(sorting, si.common_reference(si.bandpass_filter(recording)))
analyzer.compute([
    'random_spikes',
    'templates',
    # whatever else you'd like to compute
])

Then you can compute anything you'd like with spikeinterface for postprocessing, but still use kilosort on its own independently.
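For reference, read_kilosort just looks for the standard kilosort/phy output files on disk. A tiny synthetic sketch of the layout (made-up spike data; a real output folder also has params.py, templates.npy, channel_positions.npy, etc.):

```python
import os
import tempfile

import numpy as np

# Minimal synthetic kilosort-style folder (made-up data):
ks_dir = tempfile.mkdtemp()
np.save(os.path.join(ks_dir, "spike_times.npy"), np.array([100, 250, 400]))
np.save(os.path.join(ks_dir, "spike_clusters.npy"), np.array([0, 1, 0]))

# A real folder needs params.py, templates.npy, etc. before
# si.read_kilosort(ks_dir) will work.
print(sorted(os.listdir(ks_dir)))  # -> ['spike_clusters.npy', 'spike_times.npy']
```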

chrishalcrow avatar Dec 18 '25 10:12 chrishalcrow

@chrishalcrow thanks for reminding me of the read_kilosort function. It's a bit off topic, but let me ask a quick question. I remember there was a discussion about how to do preprocessing and postprocessing in SpikeInterface (with a wrapper for Kilosort4, or Kilosort4 standalone). Based on what you suggested today, if I want to use Kilosort4 standalone and only do postprocessing, including curation and data visualisation with the SortingAnalyzer, the method sounds like a combination of option 2

> Let kilosort do the preprocessing and motion correction, then try and imitate these using spikeinterface. This is a bit of a hack, but works pretty well in my experience!

and option 3

> if you're like me and don't like the whitened recording for your postprocessing, you can compute your motion correction and apply it to a non-whitened recording.

in your previous post.

The biggest difference seems to be using a whitened or unwhitened recording when creating the sorting analyzer for the postprocessing. If I am going to use whitening, then this code makes sense: motion = si.compute_motion(si.whiten(si.bandpass_filter(recording)), preset="kilosort_like")

However, in the same discussion thread, @JoeZiminski commented on another preprocessing pipeline, which is meant to do all preprocessing in SpikeInterface and then call Kilosort with car=False, skip_kilosort_preprocessing=True.

He wrote:

> you may also want to run a drift-correction step prior to whitening.

I was confused. And because I am recording from non-model insects and do not have a benchmark to test which pipeline works best, I took a break from trying out different methods.

Any thoughts on "optimal pre-processing and post-processing steps", or on "what would be the best way to evaluate which method works best for each lab"?

chiyu1203 avatar Dec 18 '25 19:12 chiyu1203