
Alter `nn_isolation` to use existing principal components

chrishalcrow opened this pull request 1 year ago

Updated the nn_isolation metric to use the already-calculated PCs.

The previous implementation of the nn_isolation metric does the following:

  1. Take two units, A and B, and their spikes and waveforms
  2. Using the waveforms of just the spikes from units A and B, run a principal component analysis
  3. Compute the isolation score based on the PCs of this analysis

This is prohibitively slow; as a result, this quality metric is currently excluded from the default quality_metric list.
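For illustration, here is a rough sketch of what that per-pair computation amounts to. This is a simplified re-creation using scikit-learn, not the actual spikeinterface code, and the function name and defaults are illustrative:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

def pairwise_nn_isolation(waveforms_a, waveforms_b, n_components=10, n_neighbors=4):
    # waveforms_a / waveforms_b: (num_spikes, num_samples * num_channels) flattened waveforms
    waveforms = np.concatenate([waveforms_a, waveforms_b])
    labels = np.concatenate([np.zeros(len(waveforms_a)), np.ones(len(waveforms_b))])

    # Step 2: a fresh PCA fitted on just these two units' spikes -- this is
    # the part that is repeated for every pair and dominates the run time
    projections = PCA(n_components=n_components).fit_transform(waveforms)

    # Step 3: isolation ~ fraction of each spike's nearest neighbours that
    # come from the same unit as the spike itself
    neighbors = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(projections)
    _, indices = neighbors.kneighbors(projections)
    same_label = labels[indices[:, 1:]] == labels[:, None]
    return same_label.mean()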

Instead, this PR implements the following:

  1. Take the PCs calculated by compute_principal_components, which uses all spikes
  2. Compute the isolation score based on the PCs of this analysis
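A minimal sketch of the new flow, assuming all_projections is the (num_spikes, num_components, num_channels) array produced by compute_principal_components and spike_labels holds each spike's unit id. Both names are placeholders for data the sorting analyzer already holds; this is not the PR code itself:

import numpy as np
from sklearn.neighbors import NearestNeighbors

def nn_isolation_from_pcs(all_projections, spike_labels, unit_a, unit_b, n_neighbors=4):
    # Select the spikes from the two units and flatten components x channels;
    # crucially, no new PCA is fitted here
    mask = np.isin(spike_labels, [unit_a, unit_b])
    projections = all_projections[mask].reshape(mask.sum(), -1)
    labels = spike_labels[mask]

    # Same nearest-neighbour step as before, on the pre-computed projections
    neighbors = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(projections)
    _, indices = neighbors.kneighbors(projections)
    same_label = labels[indices[:, 1:]] == labels[:, None]
    return same_label.mean()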

I think the new implementation is consistent with the references describing the quality metrics (https://pubmed.ncbi.nlm.nih.gov/28910621 and https://pubmed.ncbi.nlm.nih.gov/33473216/ ). Please correct me if you disagree!

It’s also:

  • Much faster (~150x on a single core), since we don’t need to redo the PCA
  • Fits into the parallelisation scheme used by the other PCA metrics => more than 150x faster!
  • Uses the sparsity scheme already used by the other PC metrics, rather than a custom one => easier to maintain

The isolation scores are generally worse in the new implementation, because the PCA is applied to all spikes, not just those in the two clusters being compared.

Also updated docstrings and docs to (hopefully) clarify what the metric is calculating.

Benchmark code. Note: num_units is the most important parameter, since there is a max_spikes limit (so a longer duration doesn’t affect things) and the method uses sparsity (so num_channels doesn’t affect things).

import spikeinterface.full as si
import numpy as np
from time import perf_counter

all_times = {}

for num_units in [10, 20, 30, 40]:

    # Simulated dataset: duration and channel count barely matter here
    # (see the note above); only num_units does.
    recording, sorting = si.generate_ground_truth_recording(durations=[10], num_channels=10, num_units=num_units)
    sorting_analyzer = si.create_sorting_analyzer(recording=recording, sorting=sorting)

    # Extensions needed before the quality metrics can be computed.
    sorting_analyzer.compute(["random_spikes", "noise_levels", "waveforms", "templates", "principal_components", "spike_locations", "spike_amplitudes"])

    # Time only the nn_isolation computation; take the median of three runs.
    times = []
    for _ in range(3):
        t_start = perf_counter()
        sorting_analyzer.compute({"quality_metrics": {"metric_names": ["nn_isolation"]}})
        t_end = perf_counter()
        times.append(t_end - t_start)

    all_times[num_units] = np.median(times)

Old times:

all_times
>>> {10: 33.851581208989955, 20: 141.90579045796767, 30: 380.2453615410486, 40: 620.6475296249846}

New times:

all_times
>>> {10: 0.2444504169980064, 20: 0.968019749969244, 30: 2.3160873339511454, 40: 4.065173499984667}
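For reference, dividing the old medians by the new ones gives per-size speedups of roughly 33.85 / 0.244 ≈ 138x, 141.9 / 0.968 ≈ 147x, 380.2 / 2.32 ≈ 164x and 620.6 / 4.07 ≈ 153x, in line with the ~150x single-core figure quoted above.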

chrishalcrow, Aug 27 '24

I think the only real issue is that the values are now different. So from my perspective it would make the most sense to wait until your other PR (allowing computation of individual metrics without redoing all quality metrics) is merged, so that a user who needs to go back and change things can do so easily.

Would we want a warning for at least a version or two saying the implementation has changed, so nn_isolation is not directly comparable with versions < 0.101.x?

zm711, Aug 29 '24

> I think the only real issue is that the values are now different. So from my perspective it would make the most sense to wait until your other PR (allowing computation of individual metrics without redoing all quality metrics) is merged, so that a user who needs to go back and change things can do so easily.

Yeah, that makes sense. Note: a bug (fixed in #3138) meant nn_isolation was impossible to use in 0.101.0. It has also been excluded from the default list of quality metrics for a long time because it took so long to run. So I'd expect there to be very few cases where someone had actually calculated it and is using it. Would love to hear if people have calculated it though!

> Would we want a warning for at least a version or two saying the implementation has changed, so nn_isolation is not directly comparable with versions < 0.101.x?

Could be a good idea. It would also be good to set a quality metrics policy in the future. Do we include lots of implementations, or just "the best" implementation? A good example is isi_violations: https://spikeinterface.readthedocs.io/en/latest/modules/qualitymetrics/isi_violations.html I think the Llobet approximation has been shown to be more accurate and is the same speed as the UMS2000 one. Should we keep both metrics, or make a policy to decide which to keep? And if we find errors in some quality metrics and fix them, do we warn about the change for a version?

We could be generous and allow for everything. Then I could call this metric nn_isolation_from_pcs and keep both implementations. We do something like this with the silhouette score: we maintain a precise slow version and a fast approximation of it https://spikeinterface.readthedocs.io/en/latest/modules/qualitymetrics/silhouette_score.html.

chrishalcrow, Aug 29 '24

I wrote the silhouette score code and added in the approximation (from Hruschka) because my computer didn't have enough RAM to do the original implementation. Haha.

I think my issue is that if someone (maybe not for nn but for other metrics) was using some cutoff, and then we change the implementation so that their values are all lower, their existing cutoff no longer works and they have to redo the process of figuring out what the new cutoff should be. But yeah, I'm not sure of the best solution, whether it be a warning or something else. So I think this is a good topic of discussion for the meeting!

zm711, Aug 29 '24

@chrishalcrow I think we should ask @magland whether the local PCA using A and B is necessary. I think this is not clear from the paper!

alejoe91, Feb 05 '25

> @chrishalcrow I think we should ask @magland whether the local PCA using A and B is necessary. I think this is not clear from the paper!

IMO, in principle it's best to do PCA on each pair, but I understand this can be computationally expensive. IDK what's the best course of action. Maybe do it on each pair, but also decrease the number of features?

magland, Feb 05 '25

Hello, I'm gonna close this - if we were to add this metric, we should add it as a separate metric. But before adding it, we should show that it is a useful thing to calculate. So for now, let's leave it.

chrishalcrow, Jun 23 '25