spikeinterface Phy & unit summary plots show widely apart channels belonging to the same unit.

Hi Everyone

I’m spike sorting Neuropixels 1.0 data via spikeinterface with spykingcircus & tridesclous, then exporting the results to phy for visualization & manual curation. I get strange views in phy & in the unit summary plots created by spikeinterface.

Some important information first: I added the sync channel to the continuous data, so my recording has 385 channels. I generated the JSON file of the probe layout from Open Ephys; this probe file has 384 channels. Then:
recording_plus_probe = recording.set_probegroup(probe)
This operation ran with no error, even though it attaches a recording & a probe with different channel counts. I'm not sure whether the result below is an insidious consequence of that.

In the attached unit_summary plots (produced with spikeinterface from the spykingcircus sorting), the highlighted electrode contacts in the left-hand plot are 4 mm apart: one at the middle of the probe (y = 3.9 mm) & the other at the tip. In phy there are correspondingly duplicated (bottom_top) channel locations belonging to the same unit in the probe view & trace view.

It seems to me that the spike sorters get correct information from the device channel ids & the geometrical locations of the channels. But at visualization time, spikeinterface maps the channel with the extremum amplitude to the adjacent channels within a radius according to the conventional channel numbers (not the device channel ids), and these do not correspond geometrically.

Could the probe layout itself be the problem? Even with a wrong probe layout, channels farther apart than a certain radius (for example 100 microns) should not be sorted together. Here both the unit summary plot & phy depict channels 4 mm apart belonging to the same unit! I created the probe layout from the JSON files output by the open_ephys GUI.

The left_right duplication is probably related to channel groupings. Though it is less of an issue, I would also like to know why it is split bilaterally. Alessio pointed out in another issue that an inappropriate selection of group_mode='by_probe' or 'by_shank' may cause problems. I don’t know whether this is also related to my case, but I will test it further.

P.S.: I record in primate brain & therefore select all the channels linearly along about 7 mm of the probe, with a small overlap at the middle. In mouse recordings, where such a long range is not needed, channels are usually selected in parallel, concentrated at the tip of the probe. In such a case (mouse layout), the above issue would presumably be less visible.

Screenshot at 2022-07-21 16-02-08 tdc

Jul 21 '22 19:07 Aryo-Zare

I did some searching. If I understood correctly, the following line defines the channel indices later used for sparsity, which is then probably used for showing the channels around the extremum channel:

https://github.com/SpikeInterface/spikeinterface/blob/7aac1053962bd4dac174d10b5457d48fda511469/spikeinterface/postprocessing/template_tools.py#L169 But is the output of this line the geometrically adjacent channels (device channel indices), or the channel numbers around the extremum channel (absolute channel index)?
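To make the question concrete, here is a minimal sketch of a radius-based channel selection in plain Python. It is not the spikeinterface implementation; the function name, the 8-contact toy probe, and the interleaved wiring are all hypothetical. It only illustrates that a radius selection is geometrically meaningful when row i of the location table really belongs to channel index i, and that it silently picks far-apart channels when the table is misaligned with the data.

```python
import math

def channels_within_radius(locations, chan_ind, radius_um):
    """Return the channel indices whose location lies within radius_um of channel chan_ind.

    locations[i] is taken to be the (x, y) position of channel index i.
    """
    ref = locations[chan_ind]
    return [i for i, loc in enumerate(locations) if math.dist(ref, loc) <= radius_um]

# Hypothetical single-column probe: 8 physical sites, 20 um pitch, listed tip to base.
site_locations = [(0.0, 20.0 * site) for site in range(8)]

# Hypothetical interleaved wiring: data channel k actually sits at site_of_channel[k].
site_of_channel = [0, 4, 1, 5, 2, 6, 3, 7]
correct_locations = [site_locations[s] for s in site_of_channel]

# Misaligned case: the table is indexed by physical site, but queried by data channel index.
selected_wrong = channels_within_radius(site_locations, chan_ind=2, radius_um=25.0)
true_y_of_wrong = [correct_locations[i][1] for i in selected_wrong]

# Aligned case: the table is indexed by data channel index, as the query expects.
selected_right = channels_within_radius(correct_locations, chan_ind=2, radius_um=25.0)
true_y_of_right = [correct_locations[i][1] for i in selected_right]

print(selected_wrong, true_y_of_wrong)   # neighbours by index, scattered in space
print(selected_right, true_y_of_right)   # true geometric neighbours
```

In the misaligned case the selected channels look adjacent by index but sit at y = 80, 20 and 100 um on the toy probe, which is the kind of pattern I see at a much larger scale.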

I guess the problem is not from the sorters, since I tried 2 different sorters with similar results. I'm not sure if the problem is from my side or Spikeinterface.

Jul 26 '22 15:07 Aryo-Zare

Hi @Aryo-Zare

Sorry for my late reply. Would it be possible to share the code that you are using? Maybe we'd also need to look at the raw+sorted data to reproduce the issue on our side!

In general, throughout the code we use 'ind' or 'index' to refer to the index of the channel, and 'id' for the actual channel id. So in the line you highlighted we are referring to the indices. I'll double-check that everything looks correct in the report summary.
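As an illustration of that naming convention, a small sketch in plain Python; the channel ids and the two helper functions are hypothetical and are not the spikeinterface API. The id is the label attached to a channel, while the index is its position in the recording's channel list.

```python
# Hypothetical channel ids, in the order the channels appear in the recording.
channel_ids = ["CH7", "CH2", "CH9", "CH4"]

# The index of a channel is simply its position in that list.
id_to_index = {chan_id: chan_ind for chan_ind, chan_id in enumerate(channel_ids)}

def to_indices(chan_ids):
    """Convert channel ids to channel indices (positions in channel_ids)."""
    return [id_to_index[chan_id] for chan_id in chan_ids]

def to_ids(chan_inds):
    """Convert channel indices back to channel ids."""
    return [channel_ids[chan_ind] for chan_ind in chan_inds]

print(to_indices(["CH9", "CH2"]))  # positions of these two ids
print(to_ids([0, 3]))              # ids stored at positions 0 and 3
```

Any per-channel array (locations, distances, templates) is addressed by index, so it has to be stored in the same order as the channel id list.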

Jul 27 '22 11:07 alejoe91

Thanks Alessio for your feedback. No problem, we're all super busy.

By 'actual channel id' do you mean physical location of the channel on the probe ? And by 'index of the channel'('ind' or 'index' ) do you mean the channel number in the output recording ?

If this is the case, shouldn't the line above use the physical id of the channel, which could then be converted later to the output recording channel index for plotting? More precisely: distances[chan_id, :] instead of distances[chan_ind, :], with the id converted to an index (ind) only afterwards, for plotting.

I attached the pipeline I'm using to analyze the data (in an MS word format since I use Spyder & copy pasted the commands). issue_pipe .docx

The probe layout JSON file as well as the raw & sorted data are in the cloud link below: https://ncloud.lin-magdeburg.de/s/oyAf7w56E7y7gnk Please let me know if you have a problem opening it.

The names of the raw & sorted data are as written in the pipeline. Sorry if they do not follow your conventions. The raw data folder is named '2022-03-23_10-51-07'.

P.S. The xml file in the raw recorded data is from an older version of open_ephys and cannot be used to get the probe layout. Hence I use the JSON file.

Jul 27 '22 18:07 Aryo-Zare

@Aryo-Zare maybe we could have a zoom call to discuss this together. Can you send me an email?

Aug 02 '22 08:08 alejoe91

@Aryo-Zare I have a thought. Does this happen for every unit or just for a subset? By default, Phy uses a best_channels sparsity: the template is computed and the N channels with the largest amplitude are selected. The unit summary, instead, uses a radius approach.

So if you have two units which are far apart, but whose spike trains are synchronized, the template could have 2 "peaky" areas and these will be included as best_channels.

Does it make sense?

You can force a radius approach for Phy as well by adding the argument: sparsity_dict=dict(method="radius", radius_um=100) to the export_to_phy() function.
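The difference between the two sparsity strategies can be sketched in plain Python. This is a toy illustration, not the spikeinterface code: the function names, the 12-contact probe, and the amplitude values are all made up. The template has two "peaky" areas, so picking the N largest amplitudes returns channels from both, while the radius selection stays around the extremum.

```python
import math

def best_channels(amplitudes, num_channels):
    """Indices of the num_channels largest absolute amplitudes, returned in channel order."""
    ranked = sorted(range(len(amplitudes)), key=lambda i: abs(amplitudes[i]), reverse=True)
    return sorted(ranked[:num_channels])

def radius_channels(amplitudes, locations, radius_um):
    """Indices of the channels within radius_um of the extremum-amplitude channel."""
    extremum = max(range(len(amplitudes)), key=lambda i: abs(amplitudes[i]))
    ref = locations[extremum]
    return [i for i, loc in enumerate(locations) if math.dist(ref, loc) <= radius_um]

# Hypothetical single-column probe: 12 contacts with 20 um pitch.
locations = [(0.0, 20.0 * i) for i in range(12)]

# Made-up template peak per channel (uV), with two peaky areas around channels 2 and 9.
amplitudes = [-5, -40, -90, -35, -4, -3, -2, -6, -30, -70, -25, -5]

by_amplitude = best_channels(amplitudes, num_channels=6)
by_radius = radius_channels(amplitudes, locations, radius_um=30.0)

print(by_amplitude)  # channels from both peaky areas
print(by_radius)     # only channels around the extremum
```

If the two views still disagree with the geometry after forcing the radius method, the sparsity method alone would not explain it.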

Aug 02 '22 09:08 alejoe91

Hi @alejoe91

Regarding the units with related far-apart channels: this happens in all units. It is also somewhat visible in the trace view in phy, in which all extracted units are color coded and are all duplicated in a proximal-distal manner (with respect to the probe). Aside from phy, the heatmap and the colored active electrode contact points in the unit summary plots (as in the figure I attached above) also show widely separated (proximal-distal) channels, for example at 0.5 mm and 4.5 mm along the length of the probe. Hence I think it's unrelated to phy.

In my probe layout, even numbered channels are physically adjacent (at the distal part of the probe), and odd numbered channels are also adjacent but at the proximal section of the probe.

My explanation is that the mapping of channels is based on the absolute channel index instead of the device channel indices (which correspond to the physical locations of the channels). That would explain such a clean-cut proximal-distal duplication of channels in both the unit summary plot and phy.

I'll send you my probe layout plot probably tomorrow (I don't have access to the file now). And I'm interested in your suggestion of having a zoom discussion on this. My email is [email protected] I would be available next week. Please let me know your preferred time spots.

Aug 02 '22 14:08 Aryo-Zare

Hi @alejoe91

Sorry for my delayed feedback. Please find attached 2 pdf files depicting the distal & proximal parts of the probe layout I use. distal .pdf proximal .pdf Both of them show both the channel index & the device channel indices. As shown, device channel indices are even-numbered or odd-numbered depending on whether they are located at the distal or proximal end of the probe.

It seems to me that spikeinterface uses the channel indices (instead of the device channel indices). For example, device channel indices 1, 2, 3, 4, 5, 6 correspond to channel indices 192, 1, 193, 2, 194, 3. The latter are then shown in the unit summary plots and phy! Hence the distal_proximal duplication.
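The six pairs quoted above can be reproduced with a small sketch. The closed-form rule and the function name are my own assumptions, extrapolated from those six values only (odd device indices go to the proximal half starting at 192, even ones to the distal half); the real layout file is the authority.

```python
def channel_index_for_device_index(device_index, half=192):
    """Assumed interleaving rule, extrapolated from the six pairs quoted above.

    Odd device indices land in the proximal half (192, 193, ...),
    even device indices in the distal half (1, 2, ...).
    """
    if device_index % 2 == 1:
        return half + (device_index - 1) // 2
    return device_index // 2

device_indices = [1, 2, 3, 4, 5, 6]
channel_indices = [channel_index_for_device_index(d) for d in device_indices]

# Gap in channel index between consecutive device channel indices.
index_gaps = [abs(a - b) for a, b in zip(channel_indices, channel_indices[1:])]

print(channel_indices)  # the six values quoted in the post
print(index_gaps)       # consecutive device indices are about half a probe apart
```

Consecutive device channel indices are about 192 channel indices apart, i.e. roughly half of the selected sites, which matches the proximal-distal pairs in the plots.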

Aug 08 '22 08:08 Aryo-Zare
