Kilosort sorting: mapping a channel id to a cluster

Describe the issue:

Hey, I'm using Kilosort4, and I was wondering how to generate a cluster table that maps each cluster to a specific channel. I could not find an output file that contains this mapping.

Thanks in advance

Jun 04 '24 13:06 ghattasb

Hello,

You can see example code for this here, in the "Plot the results" section: https://kilosort.readthedocs.io/en/latest/tutorials/basic_example.html

import numpy as np
import pandas as pd
from pathlib import Path

# `settings` is defined earlier in the tutorial; point results_dir at your own output folder
results_dir = Path(settings['data_dir']).joinpath('kilosort4')
ops = np.load(results_dir / 'ops.npy', allow_pickle=True).item()
camps = pd.read_csv(results_dir / 'cluster_Amplitude.tsv', sep='\t')['Amplitude'].values
contam_pct = pd.read_csv(results_dir / 'cluster_ContamPct.tsv', sep='\t')['ContamPct'].values
chan_map =  np.load(results_dir / 'channel_map.npy')
templates =  np.load(results_dir / 'templates.npy')
chan_best = (templates**2).sum(axis=1).argmax(axis=-1)
chan_best = chan_map[chan_best]
amplitudes = np.load(results_dir / 'amplitudes.npy')
st = np.load(results_dir / 'spike_times.npy')
clu = np.load(results_dir / 'spike_clusters.npy')
# Firing rate in Hz: spike count divided by recording duration (spike times are in samples)
firing_rates = np.unique(clu, return_counts=True)[1] * ops['fs'] / st.max()
dshift = ops['dshift']

Specifically, chan_best has one entry per cluster, and each value is the channel on which the matching template is largest (measured as the sum of squared values over time). For example, chan_best[0] is the highest-amplitude channel for cluster id 0.
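To make the chan_best computation concrete, here is a minimal sketch on synthetic data. The array shapes follow the snippet above (templates as template x time x channel), but the values and the channel numbers in chan_map are made up for illustration.

```python
import numpy as np

# Synthetic stand-in for templates.npy: (n_templates, n_timepoints, n_channels)
templates = np.zeros((3, 5, 4))
templates[0, :, 2] = 1.0    # template 0 is largest on channel index 2
templates[1, :, 0] = 2.0    # template 1 is largest on channel index 0
templates[1, :, 3] = 0.5    # ...with a smaller copy on channel index 3
templates[2, :, 1] = -3.0   # negative deflections count too, since values are squared

# Hypothetical channel map: array position -> channel number in the recording
chan_map = np.array([10, 11, 12, 13])

energy = (templates**2).sum(axis=1)    # (n_templates, n_channels)
best_index = energy.argmax(axis=-1)    # position of the largest channel per template
chan_best = chan_map[best_index]       # channel number per template

print(chan_best)  # [12 10 11]
```

The second step matters when the channel map is not simply 0..N-1: argmax gives a position within the template array, and chan_map converts that position to the channel number.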

Jun 04 '24 20:06 jacobpennington

@jacobpennington Hey Jacob, thank you for your reply! The code you provided doesn't return what we initially wanted. As Ghattas mentioned above, we want to generate a table that includes all the relevant information for each cluster. Specifically, we want to match each cluster to its channel. We are thinking of a table that looks like the "ClusterView" in Phy's TemplateGUI (we couldn't find an Export button in the GUI).

Do you have any other suggestions for us? Thanks again for your reply! :)

Jun 17 '24 13:06 LiorDor1

Hello @LiorDor1

Most of this is already done in the snippet I shared; you just have to combine the variables into a table if that's what you want. For example, this produces the columns shown in your screenshot (except sh, I'm not familiar with that one):

from pathlib import Path

import numpy as np
import pandas as pd

from kilosort.io import load_ops


results_dir = Path('C:/users/jacob/.kilosort/.test_data/kilosort4')
ops = load_ops(results_dir / 'ops.npy')

fs = ops['fs']
chan_map =  np.load(results_dir / 'channel_map.npy')
templates =  np.load(results_dir / 'templates.npy')
chan_best = (templates**2).sum(axis=1).argmax(axis=-1)
chan_best = chan_map[chan_best]
template_amplitudes = ((templates**2).sum(axis=(-2,-1))**0.5)
st = np.load(results_dir / 'spike_times.npy')
clu = np.load(results_dir / 'spike_clusters.npy')
pos = np.load(results_dir / 'spike_positions.npy')

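cluster_ids, spike_counts = np.unique(clu, return_counts=True)
firing_rates = spike_counts * fs / st.max()

# Use a float array filled by position: empty_like(cluster_ids) would be an
# integer array, and cluster ids are not guaranteed to be contiguous.
depth = np.empty(cluster_ids.size, dtype=float)
for k, i in enumerate(cluster_ids):
    spike_mask = (clu == i)
    depth[k] = pos[spike_mask,1].mean()

# chan_best and template_amplitudes have one entry per template, so select the
# entries for clusters that have spikes (assumes cluster ids still match
# template ids, i.e. no manual merges or splits).
cluster_table = pd.DataFrame.from_dict({
    'cluster': cluster_ids, 'chan': chan_best[cluster_ids], 'depth': depth, 'fr': firing_rates,
    'amp': template_amplitudes[cluster_ids], 'n_spikes': spike_counts
    }).set_index('cluster')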

Most variables are organized by either cluster id or spikes. If it's the former, you shouldn't have to do much. For the latter, you can add steps to the loop for depth.
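As a rough sketch of the per-spike case, a pandas groupby can replace the explicit loop. The arrays below are synthetic stand-ins for spike_clusters.npy, amplitudes.npy and spike_positions.npy (with column 1 treated as depth, as in the loop above); the column names are only for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic per-spike arrays; in practice these come from np.load(...)
clu = np.array([0, 0, 2, 2, 2, 5])                      # cluster id of each spike
amplitudes = np.array([10., 14., 3., 5., 7., 20.])      # one amplitude per spike
pos = np.array([[0., 100.], [0., 110.], [16., 40.],
                [16., 50.], [16., 60.], [32., 200.]])   # (x, depth) per spike

# One row per spike, then reduce each column within a cluster
spikes = pd.DataFrame({'cluster': clu, 'amp': amplitudes, 'depth': pos[:, 1]})
per_cluster = spikes.groupby('cluster').agg(
    n_spikes=('amp', 'count'),
    mean_amp=('amp', 'mean'),
    depth=('depth', 'mean'),
)

print(per_cluster)
```

Because the grouping is keyed on the cluster id itself, this works even when ids are not contiguous (here 0, 2 and 5). The result can be written out with per_cluster.to_csv(..., sep='\t') if a file similar to the other cluster_*.tsv outputs is wanted.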

Jun 17 '24 22:06 jacobpennington