whisper.cpp

Published 20 hours ago

Diarization

Open ggerganov opened this issue 2 years ago • 0 comments

Some unsuccessful experiments with audio embedding clustering

Tried to apply fuzzy C-means clustering on:

embeddings after the initial convolution in the encoder
self KV embeddings from each encoder layer
KQV embeddings from each encoder layer
embeddings from the last encoder layer
cross KV embeddings of each decoder layer
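
For readers unfamiliar with the technique, here is a minimal sketch of fuzzy C-means in Python with NumPy. It is not the code used in these experiments (whisper.cpp is C++); the function name `fuzzy_c_means`, its parameters, and the synthetic two-blob data standing in for per-frame embeddings are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Hypothetical helper. X has shape (n_samples, n_features).

    Returns (centers, W), where W[i, j] is the soft membership of
    sample i in cluster j and every row of W sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    W = rng.random((X.shape[0], c))
    W /= W.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Wm = W ** m
        # Centers are membership-weighted means of the samples.
        centers = (Wm.T @ X) / Wm.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center: (n_samples, c).
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        # Standard update: w_ij proportional to d_ij^(-2/(m-1)), rows normalized.
        inv = dist ** (-2.0 / (m - 1.0))
        W_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(W_new - W).max() < tol
        W = W_new
        if converged:
            break
    return centers, W

# Synthetic stand-in for embeddings of two speakers (assumed data, not real audio).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (50, 8)), rng.normal(3.0, 0.1, (50, 8))])
centers, W = fuzzy_c_means(X, c=2)
labels = W.argmax(axis=1)  # hard assignment: most likely cluster per frame
```

Unlike k-means, each frame gets a membership weight for every cluster, which seems a natural fit for frames where speakers overlap; `labels` is only a convenience for comparing against a reference segmentation.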

Instead of clustering the full embedding dimensions, first reduce dimensionality using SVD:

decompose the embeddings: E = U S V^T
compute the scaled singular vectors: U' = U S
project E onto U' and keep the top few coordinates
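
The steps above can be sketched in Python with NumPy as follows. This is an illustration under stated assumptions, not the code from the experiments: it assumes E is stored with one embedding per column, shape (d, n_frames), and the function name `svd_reduce`, the parameter `k`, and the random input are hypothetical.

```python
import numpy as np

def svd_reduce(E, k):
    """Hypothetical helper: reduce frame embeddings to k coordinates via SVD.

    E: array of shape (d, n_frames), one embedding per column (assumed layout).
    Returns an array of shape (k, n_frames).
    """
    # Thin SVD: E = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    # Scale each left singular vector by its singular value: U' = U S.
    U_scaled = U * s
    # Project every column of E onto the columns of U', keep the top k rows.
    return (U_scaled.T @ E)[:k]

# Random stand-in for encoder output (assumed sizes: 384 dims, 1500 frames).
rng = np.random.default_rng(0)
E = rng.standard_normal((384, 1500))
Z = svd_reduce(E, k=4)
print(Z.shape)  # (4, 1500)
```

With this layout the projection U'^T E equals S^2 V^T, so each retained coordinate is a right singular vector weighted by its squared singular value. Projecting onto the unscaled U instead would give S V^T, the usual PCA-style scores; the sketch follows the steps as written.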

Nov 08 '22 18:11 ggerganov