TdC2 : clustering params do not get passed properly

Open b-grimaud opened this issue 3 months ago • 3 comments

The params dictionary for TdC2 lists the clustering params under the clustering sub-dictionary.

sorter_params = {
    ...
    "clustering": {
        "recursive_depth": 5,
        "split_radius_um": 40.0,
        "clusterer": "isosplit",
        "clusterer_kwargs": {
            "n_init": 50,
            "min_cluster_size": 10,
            "max_iterations_per_pass": 500,
            "isocut_threshold": 2.0,
        },
        "do_merge": True,
        "merge_kwargs": {
            "similarity_metric": "l1",
            "num_shifts": 4,
            "similarity_thresh": 0.75,
        },
        "min_size_split": 25,
    },
    ...
}

However, the default params dict of TdcClustering has those parameters listed under the split sub-dictionary.

cluster_params = {
    ...
    "split": {
        "recursive_depth": 3,
        "split_radius_um": 40.0,
        "clusterer": "isosplit",
        "clusterer_kwargs": {
            "n_init": 50,
            "min_cluster_size": 10,
            "max_iterations_per_pass": 500,
            "isocut_threshold": 2.0,
        },
        "do_merge": True,
        "merge_kwargs": {
            "similarity_metric": "l1",
            "num_shifts": 3,
            "similarity_thresh": 0.8,
        },
        "min_size_split": 10,
    },
    ...
}

When find_cluster_from_peaks is called, the default params of TdcClustering are copied and then updated with the clustering and waveforms sub-dictionaries of the sorter params. Because the clustering params sit under a different key (clustering instead of split), they are added as a separate entry next to the defaults instead of overriding them.

effective_params = {
    ...
    'split': {
        'recursive_depth': 3,
        'split_radius_um': 40.0,
        'clusterer': 'isosplit',
        'clusterer_kwargs': {
            'n_init': 50,
            'min_cluster_size': 10,
            'max_iterations_per_pass': 500,
            'isocut_threshold': 2.0
        },
    },
    ...
    'clustering': {
        'recursive_depth': 5,
        'split_radius_um': 40.0,
        'clusterer': 'isosplit',
        'clusterer_kwargs': {
            'n_init': 50,
            'min_cluster_size': 10,
            'max_iterations_per_pass': 500,
            'isocut_threshold': 2.0
        },
    },
    ...
}

Since TdcClustering then reads its params from split, any values the user changed under clustering are never applied.
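The behaviour can be reproduced with plain dictionaries. The sketch below is not the spikeinterface code: the dictionaries are trimmed stand-ins for the ones quoted above, and `build_effective_params` is a hypothetical helper that only assumes the "copy the defaults, then update at the top level" logic described in this issue.

```python
import copy

# Trimmed stand-ins for the dictionaries quoted above; only two keys are kept.
# This is NOT the spikeinterface code, just a sketch of the described behaviour.
DEFAULT_CLUSTER_PARAMS = {
    "split": {"recursive_depth": 3, "min_size_split": 10},
}

sorter_params = {
    "clustering": {"recursive_depth": 5, "min_size_split": 25},
}


def build_effective_params(defaults, sorter_params):
    """Hypothetical helper: copy the defaults, then update them at the top level."""
    effective = copy.deepcopy(defaults)
    # dict.update only replaces top-level keys, so "clustering" is added
    # next to "split" rather than being merged into it.
    effective.update({"clustering": sorter_params["clustering"]})
    return effective


effective_params = build_effective_params(DEFAULT_CLUSTER_PARAMS, sorter_params)

# Code that reads "split" still sees the default value ...
print(effective_params["split"]["recursive_depth"])       # prints 3
# ... while the user's value sits unused under "clustering".
print(effective_params["clustering"]["recursive_depth"])  # prints 5
```

Both keys end up in the result, which matches the duplicated effective_params shown above.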

b-grimaud avatar Oct 02 '25 14:10 b-grimaud

@samuelgarcia this is an issue with tdc2.

@b-grimaud we recently forced TDC2 and SC2 to include a version number to help us keep track of what versions have problems. Could you share the sorter version number here too?

zm711 avatar Oct 04 '25 18:10 zm711

Hi @b-grimaud, tridesclous2 and spykingcircus2 are in the middle of a major cleanup.

See #4140.

All parameters are changing to become more consistent. The list you mention is no longer valid. Even TdcClustering has been renamed iterative_isosplit.

Sorry for all these massive changes, but they should also come with many improvements to the algorithms.

You can expect some more changes in the coming weeks, followed by a stable release.

samuelgarcia avatar Oct 06 '25 12:10 samuelgarcia

Noted, thanks for the update!

@zm711 I'm assuming this is what get_sorter_version returns? Asking for future reference.

b-grimaud avatar Oct 06 '25 14:10 b-grimaud