
Train Own Voice for "Speaker 0" Naming

Open ThatGuySizemore opened this issue 1 year ago • 3 comments

As a user, I want to be able to train Deepgram's transcription functions to identify my own voice so that anything I speak is labeled with my name and summaries are able to identify what I said compared to other people.

ThatGuySizemore avatar May 17 '24 16:05 ThatGuySizemore

Thanks for the feedback @ThatGuySizemore. Does Deepgram offer this functionality right now? I'd love to see a proof of concept of what it might look like to train and use a customized model.

after-ephemera avatar May 18 '24 13:05 after-ephemera

@after-ephemera - Looks like Deepgram's diarization doesn't support it, unlike Whisper. I wonder if there is a different way to identify "self" in the transcriptions. As it stands, the summaries take a bit of decoding to understand precisely what happened. For example, when someone chats with memories and asks "Are there any tasks I agreed to do today?", it typically won't work because the speaker isn't identified. Occasionally Deepgram can use context clues in a conversation to tell "self" from other people, but it's hit or miss.

ThatGuySizemore avatar May 21 '24 16:05 ThatGuySizemore

@after-ephemera I'm doing some additional digging and working with some of the devs over at Deepgram to understand the API better.

ThatGuySizemore avatar May 21 '24 16:05 ThatGuySizemore

Hey @ThatGuySizemore, thanks for pointing this out. Deepgram actually doesn't work well for this, so we are trying to build it on our own. Will keep you updated :)

josancamon19 avatar Jun 03 '24 05:06 josancamon19

Is this not implemented now?

bbookman avatar Jun 06 '24 19:06 bbookman