cdp-backend

Speaker classification

Open isaacna opened this issue 3 years ago • 18 comments

Feature Description

Backend issue for the relevant roadmap issue

Add speaker classification to CDP transcripts. This could be a script or class that retroactively attaches speaker names to a transcript that was already produced with speaker diarization enabled. Prodigy can be used to annotate the training data.

Use Case

With speaker classification we can provide transcripts in which each section is annotated with the speaker's name. The annotation step could be run in several ways, such as through a script or a GitHub Action.

Solution

A very high-level idea would be to:

  • Use GCP's built-in speaker diarization to separate the speakers. Alternatively, we could create our own audio classification model. We could also use something like Prodigy to annotate the data, and I believe they have their own diarization/transcription models as well.
  • Figure out how to add the classified speaker names to the diarized transcript. I'm not sure whether GCP allows you to provide any training data. From what I could tell, their models only separate the speakers and don't take in training data to label them.
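As a rough illustration of the second step, here is a minimal sketch in Python of attaching names to a diarized transcript. Everything in it is an assumption made for the example: the `Sentence` class, the `speaker_index` and `speaker_name` fields, and the two helper functions are hypothetical and are not the cdp-backend transcript model or any GCP API. It assumes diarization has already given each sentence an integer speaker index, and that some separate classifier (not shown) has produced a predicted name for individual audio clips from each diarized speaker. The predictions are combined per speaker by majority vote, and speakers without any prediction are left unnamed.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Sentence:
    """Hypothetical diarized sentence; not the cdp-backend transcript model."""

    text: str
    speaker_index: int  # integer label assigned by diarization
    speaker_name: Optional[str] = None  # filled in by classification


def vote_names(predictions: Iterable[Tuple[int, str]]) -> Dict[int, str]:
    """Pick the most frequently predicted name for each diarized speaker index.

    `predictions` holds (speaker_index, predicted_name) pairs, one per audio
    clip that a (hypothetical) classifier was run on.
    """
    votes: Dict[int, Counter] = defaultdict(Counter)
    for speaker_index, name in predictions:
        votes[speaker_index][name] += 1
    return {index: counts.most_common(1)[0][0] for index, counts in votes.items()}


def attach_speaker_names(
    sentences: List[Sentence], names_by_index: Dict[int, str]
) -> List[Sentence]:
    """Return copies of the sentences with speaker_name set where a name is known."""
    return [
        replace(s, speaker_name=names_by_index.get(s.speaker_index))
        for s in sentences
    ]


if __name__ == "__main__":
    transcript = [
        Sentence("Good morning, everyone.", speaker_index=1),
        Sentence("Thank you, chair.", speaker_index=2),
        Sentence("Let's begin.", speaker_index=1),
    ]
    # Placeholder names standing in for classifier output.
    clip_predictions = [(1, "Speaker A"), (1, "Speaker A"), (1, "Speaker B"), (2, "Speaker B")]
    for sentence in attach_speaker_names(transcript, vote_names(clip_predictions)):
        print(f"{sentence.speaker_name}: {sentence.text}")
```

Keeping the vote separate from the attach step means the same transcript-rewriting code would work whether the names come from a trained audio model or from manual annotation.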

A bigger picture breakdown of all the major components can be found on the roadmap issue under "Major components".

isaacna Nov 05 '21 01:11