diart
Optionally relax incremental clustering constraints
Problem
Cannot-link constraints are currently hard-coded in OnlineSpeakerClustering. If a segmentation model over-segments speakers, it may be better to rely on speaker embeddings instead to determine the identity of a speaker turn.
cc: @hbredin
Idea
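Implement a different optimal mapping strategy in SpeakerMap that replaces LSAP (the linear sum assignment problem, solved with the Hungarian algorithm) with a simple argmax/argmin. Unlike LSAP, argmax/argmin does not force a one-to-one mapping, so several local speakers can be assigned to the same global speaker.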
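To make the difference concrete, here is a small dependency-free sketch comparing a one-to-one assignment (what LSAP enforces) with a row-wise argmin. The cost values and function names are made up for illustration and are not part of diart. The brute-force search only stands in for the Hungarian algorithm on a tiny matrix.

```python
from itertools import permutations
from typing import List, Sequence


def one_to_one_assignment(cost: Sequence[Sequence[float]]) -> List[int]:
    """Brute-force LSAP stand-in: every row gets a distinct column."""
    n_rows, n_cols = len(cost), len(cost[0])
    best = min(
        permutations(range(n_cols), n_rows),
        key=lambda cols: sum(cost[r][c] for r, c in enumerate(cols)),
    )
    return list(best)


def relaxed_assignment(cost: Sequence[Sequence[float]]) -> List[int]:
    """Row-wise argmin: several rows may pick the same column."""
    return [min(range(len(row)), key=row.__getitem__) for row in cost]


# Hypothetical distances: rows are local speakers, columns are global speakers.
# Both local speakers are closest to global speaker 0 (over-segmentation).
cost = [
    [0.10, 0.90, 0.80],
    [0.15, 0.60, 0.95],
]

strict = one_to_one_assignment(cost)   # forced onto two different columns
relaxed = relaxed_assignment(cost)     # both rows may share column 0
```

With the one-to-one constraint, the second local speaker is pushed to a worse match. The relaxed version lets both rows merge into the same global speaker.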
Example
A quick implementation could take advantage of the existing MappingMatrixObjective that's already detached from SpeakerMap.
from typing import List

import numpy as np
from diart.mapping import MappingMatrixObjective, SpeakerMap

class RelaxedMinimizationObjective(MappingMatrixObjective):
    def optimal_assignments(self, matrix: np.ndarray) -> List[int]:
        # Each row independently picks its cheapest column (no one-to-one constraint)
        return list(np.argmin(matrix, axis=1))

# cost_matrix is assumed to be defined elsewhere
relaxed_mapping = SpeakerMap(cost_matrix, RelaxedMinimizationObjective())
I would also try to keep cannot-link constraints for overlapping speakers only (and allow merging non-overlapping speakers).
The implementation might be trickier, though.
That is a good idea. I think that would require major changes in SpeakerMap because right now it doesn't have a way of knowing who overlaps whom. Or maybe it can be implemented in the MappingMatrixObjective subclass.
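A rough sketch of what an overlap-aware variant might look like, independent of diart's actual classes. It assumes the overlap information is available as a boolean matrix between local speakers, which SpeakerMap does not currently provide. The function name and inputs are hypothetical, and the greedy strategy is just one simple option, not an optimal solver.

```python
from typing import List, Sequence


def overlap_aware_assignment(
    cost: Sequence[Sequence[float]],
    overlaps: Sequence[Sequence[bool]],
) -> List[int]:
    """Greedy argmin that keeps cannot-link only between overlapping speakers.

    cost[i][j]: distance between local speaker i and global speaker j.
    overlaps[i][k]: True if local speakers i and k are active at the same time.
    Returns one global speaker index per local speaker (-1 if none is allowed).
    """
    n_local, n_global = len(cost), len(cost[0])
    assigned = [-1] * n_local
    # Handle the most confident local speakers (lowest best cost) first.
    order = sorted(range(n_local), key=lambda i: min(cost[i]))
    for i in order:
        # Global speakers already taken by a local speaker that overlaps with i.
        forbidden = {
            assigned[k]
            for k in range(n_local)
            if overlaps[i][k] and assigned[k] >= 0
        }
        candidates = [j for j in range(n_global) if j not in forbidden]
        if candidates:
            assigned[i] = min(candidates, key=lambda j: cost[i][j])
    return assigned


# Hypothetical example: all three local speakers are closest to global speaker 0,
# but local speakers 0 and 2 overlap, so they cannot share a global speaker.
cost = [
    [0.10, 0.90, 0.80],
    [0.15, 0.60, 0.95],
    [0.20, 0.70, 0.90],
]
overlaps = [
    [False, False, True],
    [False, False, False],
    [True, False, False],
]
mapping = overlap_aware_assignment(cost, overlaps)
```

Here local speakers 0 and 1 are merged into the same global speaker, while local speaker 2 is pushed to its next-best match because it overlaps with local speaker 0.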