Sorting channel id to a cluster
Describe the issue:
Hey, I'm using Kilosort4, and I was wondering how to generate a cluster table that maps each cluster to a specific channel. I could not find an output file that contains this information.
Thanks in advance
Hello,
You can see example code for this here, in the "Plot the results" section: https://kilosort.readthedocs.io/en/latest/tutorials/basic_example.html
import numpy as np
import pandas as pd
from pathlib import Path
results_dir = Path(settings['data_dir']).joinpath('kilosort4')
ops = np.load(results_dir / 'ops.npy', allow_pickle=True).item()
camps = pd.read_csv(results_dir / 'cluster_Amplitude.tsv', sep='\t')['Amplitude'].values
contam_pct = pd.read_csv(results_dir / 'cluster_ContamPct.tsv', sep='\t')['ContamPct'].values
chan_map = np.load(results_dir / 'channel_map.npy')
templates = np.load(results_dir / 'templates.npy')
chan_best = (templates**2).sum(axis=1).argmax(axis=-1)
chan_best = chan_map[chan_best]
amplitudes = np.load(results_dir / 'amplitudes.npy')
st = np.load(results_dir / 'spike_times.npy')
clu = np.load(results_dir / 'spike_clusters.npy')
firing_rates = np.unique(clu, return_counts=True)[1] * 30000 / st.max()
dshift = ops['dshift']
Specifically, the size of chan_best is equal to the number of clusters, and each value is the channel with the highest amplitude for the matched template. I.e. chan_best[0] would be the highest amplitude channel for cluster id 0.
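As a toy illustration of what the chan_best lines compute, here is a minimal sketch on made-up data. It assumes templates has shape (n_templates, n_timepoints, n_channels); the array values and the channel map below are hypothetical stand-ins, not real Kilosort output.

```python
import numpy as np

# Toy stand-in for templates.npy, assumed shape: (n_templates, n_timepoints, n_channels)
templates = np.zeros((3, 5, 4))
templates[0, :, 2] = 1.0   # template 0 is largest on channel index 2
templates[1, :, 0] = 2.0   # template 1 is largest on channel index 0
templates[2, :, 3] = 0.5   # template 2 is largest on channel index 3
templates[2, :, 1] = 0.1   # ...with a smaller deflection on channel index 1

# Hypothetical channel map: position in the template -> channel number in the recording
chan_map = np.array([10, 11, 12, 13])

# Sum squared template values over time, giving one value per (template, channel)
energy = (templates**2).sum(axis=1)

# Index of the strongest channel for each template, then translate through the channel map
best_idx = energy.argmax(axis=-1)
chan_best = chan_map[best_idx]

print(chan_best)  # [12 10 13]
```

Note that "highest amplitude" here is measured as the summed squared template value per channel, since that is what the argmax is taken over.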
@jacobpennington
Hey Jacob, thank you for your reply!
The code you provided doesn't return what we initially wanted. As Ghattas mentioned above, we want to generate a table that includes all the relevant information for each cluster. Specifically, we want to match each cluster to its channel. We are thinking of a table that looks like the "ClusterView" in Phy's TemplateGUI (we couldn't find an Export button in the GUI).
Do you have any other suggestions for us? Thanks again for your reply! :)
Hello @LiorDor1
Most of this is already done in the snippet I shared; you just have to combine the variables into a table if that's what you want. For example, this will produce the columns shown in your screenshot (except sh, I'm not familiar with that one):
from pathlib import Path
import numpy as np
import pandas as pd
from kilosort.io import load_ops
results_dir = Path('C:/users/jacob/.kilosort/.test_data/kilosort4')
ops = load_ops(results_dir / 'ops.npy')
fs = ops['fs']
chan_map = np.load(results_dir / 'channel_map.npy')
templates = np.load(results_dir / 'templates.npy')
chan_best = (templates**2).sum(axis=1).argmax(axis=-1)
chan_best = chan_map[chan_best]
template_amplitudes = ((templates**2).sum(axis=(-2,-1))**0.5)
st = np.load(results_dir / 'spike_times.npy')
clu = np.load(results_dir / 'spike_clusters.npy')
pos = np.load(results_dir / 'spike_positions.npy')
cluster_ids = np.unique(clu)
spike_counts = np.unique(clu, return_counts=True)[1]
firing_rates = spike_counts * fs / st.max()
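# Use a float array: empty_like(cluster_ids) would be integer-typed and truncate the means.
depth = np.zeros(len(cluster_ids), dtype=float)
# Enumerate so that non-contiguous cluster ids still index into depth correctly.
for idx, i in enumerate(cluster_ids):
    spike_mask = (clu == i)
    depth[idx] = pos[spike_mask, 1].mean()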
pd.DataFrame.from_dict({
'cluster': cluster_ids, 'chan': chan_best, 'depth': depth, 'fr': firing_rates,
'amp': template_amplitudes, 'n_spikes': spike_counts
}).set_index('cluster')
Most variables are organized by either cluster id or spikes. If it's the former, you shouldn't have to do much. For the latter, you can add steps to the loop for depth.
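For the spike-organized variables, one alternative to adding steps to the loop is a pandas groupby. The following is only a sketch on made-up arrays standing in for spike_clusters.npy, spike_positions.npy and amplitudes.npy; the column names are arbitrary, and the second column of the positions array is assumed to be depth, as in the loop above.

```python
import numpy as np
import pandas as pd

# Toy per-spike arrays (hypothetical values); cluster ids are deliberately non-contiguous
clu = np.array([0, 0, 2, 2, 2, 5])
pos = np.array([
    [10.0, 100.0],
    [12.0, 110.0],
    [30.0, 300.0],
    [30.0, 310.0],
    [30.0, 320.0],
    [50.0, 500.0],
])
amplitudes = np.array([1.0, 3.0, 2.0, 2.0, 2.0, 4.0])

# One row per spike
spikes = pd.DataFrame({
    'cluster': clu,
    'depth': pos[:, 1],
    'amp': amplitudes,
})

# Collapse to one row per cluster: mean depth, mean amplitude, and spike count
per_cluster = spikes.groupby('cluster').agg(
    depth=('depth', 'mean'),
    amp=('amp', 'mean'),
    n_spikes=('depth', 'size'),
)

print(per_cluster)
```

Because the result is indexed by cluster id, it does not depend on the ids being contiguous. If you want a file, something like per_cluster.to_csv('cluster_table.tsv', sep='\t') would write it out (the filename is just an example).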