Optionally relax incremental clustering constraints

Open juanmc2005 opened this issue 2 years ago • 2 comments

Problem

Cannot-link constraints are currently hard-coded in OnlineSpeakerClustering. If a segmentation model over-segments speakers, it may be better to rely on speaker embeddings instead to determine the identity of a speaker turn.

cc: @hbredin

Idea

Implement a different optimal mapping strategy in SpeakerMap that replaces the linear sum assignment problem (LSAP) solver, i.e. the Hungarian algorithm, with a simple argmax/argmin.

Example

A quick implementation could take advantage of the existing MappingMatrixObjective that's already detached from SpeakerMap.

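from typing import List

import numpy as np
from diart.mapping import MappingMatrixObjective, SpeakerMap

class RelaxedMinimizationObjective(MappingMatrixObjective):
    def optimal_assignments(self, matrix: np.ndarray) -> List[int]:
        # Each row independently picks its lowest-cost column,
        # so several rows may be assigned to the same column
        return list(np.argmin(matrix, axis=1))

# cost_matrix: a precomputed mapping matrix (not defined in this snippet)
relaxed_mapping = SpeakerMap(cost_matrix, RelaxedMinimizationObjective())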
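
To make the difference concrete, here is a toy comparison that does not depend on diart. It is only a sketch: the cost values are made up, both function names are hypothetical, and a brute-force search over permutations stands in for the Hungarian solver (in practice that would be something like scipy.optimize.linear_sum_assignment). It assumes there are at least as many columns as rows.

```python
from itertools import permutations
from typing import List

import numpy as np


def one_to_one_assignment(cost: np.ndarray) -> List[int]:
    """Lowest total cost with each column used at most once.

    Brute force stand-in for an LSAP solver; only viable for tiny matrices.
    """
    n_rows, n_cols = cost.shape
    best = min(
        permutations(range(n_cols), n_rows),
        key=lambda cols: sum(cost[row, col] for row, col in enumerate(cols)),
    )
    return list(best)


def relaxed_assignment(cost: np.ndarray) -> List[int]:
    """Each row picks its own cheapest column, duplicates allowed."""
    return [int(col) for col in np.argmin(cost, axis=1)]


# Hypothetical distances: two rows that are both closest to column 0
cost = np.array([
    [0.1, 0.9, 0.8],
    [0.2, 0.7, 0.9],
])

print(one_to_one_assignment(cost))  # [0, 1]: the second row is pushed to column 1
print(relaxed_assignment(cost))     # [0, 0]: both rows share column 0
```

With the one-to-one constraint, the second row is forced away from its closest column. With argmin, both rows end up on the same column, which is the behavior wanted when the segmentation model over-segments a single speaker.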

juanmc2005 avatar May 31 '22 17:05 juanmc2005

I would also try to keep cannot-link constraints for overlapping speakers only (and allow merging non-overlapping speakers).

The implementation might be trickier, though.

hbredin avatar Jun 01 '22 07:06 hbredin

That is a good idea. I think it would require major changes in SpeakerMap, because right now it has no way of knowing which speakers overlap. Or maybe it can be implemented in a MappingMatrixObjective subclass.
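
A rough, standalone sketch of what keeping cannot-link constraints only for overlapping speakers might look like. Nothing here is diart API: the function name and the boolean `overlaps` matrix are assumptions, and how that overlap information would reach SpeakerMap is exactly the open question. The assignment is greedy rather than optimal.

```python
from typing import List

import numpy as np


def overlap_aware_assignment(cost: np.ndarray, overlaps: np.ndarray) -> List[int]:
    """Greedy argmin where only overlapping rows are forbidden to share a column.

    cost: (n_rows, n_cols) matrix, lower is better.
    overlaps: (n_rows, n_rows) symmetric boolean matrix (hypothetical input),
        True where two rows are active at the same time.
    Returns one column per row, or -1 if every column is forbidden for that row.
    """
    n_rows = cost.shape[0]
    assignment = [-1] * n_rows
    # Handle the most confident rows first (lowest best cost)
    for row in (int(r) for r in np.argsort(cost.min(axis=1))):
        # Columns already taken by a row that overlaps this one
        forbidden = {
            assignment[other]
            for other in range(n_rows)
            if overlaps[row, other] and assignment[other] >= 0
        }
        # Take the cheapest column that is not forbidden
        for col in (int(c) for c in np.argsort(cost[row])):
            if col not in forbidden:
                assignment[row] = col
                break
    return assignment


cost = np.array([
    [0.1, 0.9],
    [0.2, 0.8],
    [0.3, 0.7],
])
# Rows 0 and 1 overlap in time; row 2 overlaps with nobody
overlaps = np.array([
    [False, True, False],
    [True, False, False],
    [False, False, False],
])

print(overlap_aware_assignment(cost, overlaps))  # [0, 1, 0]
```

Rows 0 and 1 cannot share a column because they overlap, while row 2 is free to merge with row 0 on column 0. With an all-False `overlaps` matrix this reduces to the plain argmin strategy.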

juanmc2005 avatar Jun 01 '22 08:06 juanmc2005