ecephys_spike_sorting

PCmetrics not being calculated for some clusters

Open JRicardo24 opened this issue 1 year ago • 5 comments

Hi guys, I was checking out the function in the metrics module that calculates the principal-component-related metrics, from which the following screenshot was taken: [image: Github_askJosh] The condition on line 286 seems to be working just fine: I tested it on my own dataset and it doesn't calculate the metrics for clusters with 20 spikes or fewer (I also tested a 100-spike threshold). However, I do have some other clusters in my dataset with more than 20 spikes, including one with roughly 30k spikes, for which the metrics are not being calculated (its row in the DataFrame is filled with NaN values). I was wondering if this has to do with the conditions on lines 284 or 285, since I haven't fully understood what they accomplish. Any help is appreciated :) @jsiegle

JRicardo24, Sep 06 '22 02:09

Just to clarify – have you done any manual curation on this dataset? If not, then do you know which of the four conditions (all_pcs.shape[0] > 10, not (all_labels == cluster_id).all(), etc.) is causing it to skip the calculation for the units with lots of spikes?
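One way to answer that question is to evaluate each condition on its own rather than as a single combined `if`. The snippet below is a minimal sketch, not the repository's code: the helper name `check_pc_conditions`, the `min_spikes` parameter, and the toy arrays are hypothetical, and it covers only the three conditions that are spelled out in this thread (the fourth is not named here, so it is left out).

```python
import numpy as np

def check_pc_conditions(all_pcs, all_labels, cluster_id, min_spikes=20):
    """Evaluate each condition separately so the failing one is visible.

    Hypothetical helper: only the conditions quoted in the thread are checked.
    """
    n_in_cluster = int(np.sum(all_labels == cluster_id))
    return {
        # Enough PC rows overall
        "enough_pcs_total": bool(all_pcs.shape[0] > 10),
        # At least one label belongs to a different cluster
        "other_clusters_present": bool(not (all_labels == cluster_id).all()),
        # Enough spikes labelled with this cluster id
        "enough_spikes_in_cluster": n_in_cluster > min_spikes,
    }

# Toy data (made up): 50 spikes from clusters 7 and 8, none from cluster 240
rng = np.random.default_rng(0)
all_pcs = rng.normal(size=(50, 3, 4))
all_labels = np.repeat([7, 8], 25)

result = check_pc_conditions(all_pcs, all_labels, cluster_id=240)
failed = [name for name, ok in result.items() if not ok]
print(failed)
```

Printing the same dictionary inside the per-cluster loop for an affected unit should show which check is the one returning False.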

jsiegle, Sep 07 '22 20:09

Sorry for the delay @jsiegle. No manual curation has been done on the dataset. After some tests I found that the condition causing it to skip the calculation for units with lots of spikes, like the cluster we have with 30917 spikes, is (sum(all_labels == cluster_id) > 100). In this case it is cluster 240, but when I print its (sum(all_labels == cluster_id)) the result is 0. I truly don't know why all_labels for this cluster does not contain 240 among its elements, but that is what is preventing the calculation of the PC metrics.

JRicardo24, Oct 17 '22 13:10

Can you print the values in relative_counts for this cluster? It's possible that the count scaling is causing there to be zero PCs included in the calculation.
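To illustrate how count scaling could leave a unit with zero PCs, here is a rough sketch under a stated assumption: that each unit's subsample size is its spike count relative to the largest unit, multiplied by max_spikes_for_unit and truncated to an integer. That formula is an assumption for illustration, not something confirmed in this thread, and `subsample_sizes` and the spike counts other than 30917 are made up.

```python
import numpy as np

def subsample_sizes(spike_counts, max_spikes_for_unit):
    """Hypothetical scaling: relative count times the cap, truncated to int."""
    spike_counts = np.asarray(spike_counts, dtype=float)
    relative_counts = spike_counts / spike_counts.max()
    # Truncation means any unit with relative_count < 1 / max_spikes_for_unit gets 0
    return (relative_counts * max_spikes_for_unit).astype(int)

# 30917 is the spike count mentioned above; 5000 and 10 are made-up neighbours
sizes = subsample_sizes([30917, 5000, 10], max_spikes_for_unit=2000)
print(sizes.tolist())
```

Under this assumption, a unit whose scaled size truncates to 0 would contribute no rows to all_labels, so sum(all_labels == cluster_id) would come out as 0 regardless of how many spikes the unit actually has.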

jsiegle, Oct 20 '22 18:10

[image: Git_resposta] This is the relative_counts for cluster 240. The value printed below it, 41, is just the length. This test was made with a max_spikes_for_unit value of 2000. Here's more info that might be helpful, @jsiegle: [image: aditional_info]

JRicardo24, Oct 21 '22 01:10

I'm not sure what could be causing this. Let me know if you're able to gain any more insight into the problem.

jsiegle, Oct 31 '22 20:10