pyannote-audio: How does diarization assign speakers in overlapped speech?

Open YuGuiwe opened this issue 2 years ago • 4 comments

How does speaker diarization assign speakers in overlapped speech? I saw that overlapped speech is masked in pyannote.audio.pipelines.speaker_diarization, line 294. I would also like to understand how speech segmentation, resegmentation, and speaker diarization relate to each other, and what each one is for.

Jun 13 '22 06:06 YuGuiwe

I recommend you read this paper, which describes an online variant of what is currently implemented in the develop branch.

Jun 13 '22 07:06 hbredin

Thanks for your reply! I have already read this paper, but the speaker assignment method seems different from it. I don't understand why the clustering step in your diarization pipeline doesn't compute embeddings over overlapping speech intervals, yet still predicts the correct speakers there. Or did I miss something that happens after you mask the overlapping speech intervals?

Jun 13 '22 07:06 YuGuiwe

Overlapping frames are only masked for computing the embeddings, not for the rest of the pipeline. Anyway, I plan to write a technical report describing the approach and will share it once it is ready.
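To make the distinction concrete, here is a rough sketch in plain Python. It is not the actual pyannote.audio code: the function names, the toy data, and the masked mean (standing in for the embedding model's pooling) are all assumptions for illustration, and `local_to_global` stands in for whatever the clustering step returns. It only shows the idea that overlapped frames are dropped when building each speaker's embedding, while the frame-level activations, overlap included, are kept and relabeled afterwards.

```python
def non_overlap_masks(activations):
    """activations: list of frames, each a list of 0/1 flags per local speaker.
    Returns one mask per speaker: 1 where that speaker is the only one active."""
    num_speakers = len(activations[0])
    masks = [[0] * len(activations) for _ in range(num_speakers)]
    for t, frame in enumerate(activations):
        if sum(frame) == 1:  # exactly one active speaker: a "clean" frame
            masks[frame.index(1)][t] = 1
    return masks


def masked_mean(features, mask):
    """Average frame-level feature vectors over the frames kept by the mask
    (hypothetical stand-in for the real embedding extraction)."""
    kept = [f for f, m in zip(features, mask) if m]
    if not kept:
        return None  # this speaker never speaks alone: no clean embedding
    dim = len(kept[0])
    return [sum(f[d] for f in kept) / len(kept) for d in range(dim)]


def relabel(activations, local_to_global):
    """Propagate cluster labels to every active frame, overlap included."""
    return [
        sorted(local_to_global[s] for s, active in enumerate(frame) if active)
        for frame in activations
    ]


# Toy chunk: 4 frames, 2 local speakers; frame 1 is overlapped speech.
activations = [[1, 0], [1, 1], [0, 1], [0, 0]]
features = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.0, 0.0]]

masks = non_overlap_masks(activations)
embeddings = [masked_mean(features, m) for m in masks]

# Assumed clustering result: local speaker index -> global speaker label.
local_to_global = {0: "SPEAKER_00", 1: "SPEAKER_01"}
labels = relabel(activations, local_to_global)
```

In this toy case the overlapped frame contributes to neither embedding, but after relabeling it still carries both speaker labels, because the labels come from the segmentation activations rather than from the embeddings.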

Jun 13 '22 08:06 hbredin

Thank you again, I really appreciate it! So assigning speakers in overlapping regions doesn't rely on embeddings but on some other mechanism. Where can I find the code?

My data has many speakers who each talk only briefly (about 1 second), and most of their speech overlaps with others. Without fine-tuning, the predictions on those segments are often unreliable. Before fine-tuning, I want to fully understand these mechanisms.

Jun 13 '22 08:06 YuGuiwe

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Aug 12 '22 18:08 stale[bot]
