Resemblyzer

d-Vectors for UIS-RNN

Open arthavmane opened this issue 5 years ago • 4 comments

I'm working on a project in which I want to use d-vector embeddings to train a model. Can someone please explain how to compute d-vectors for different utterances from different speakers so that I can pass them into the UIS-RNN model?

arthavmane avatar May 06 '20 09:05 arthavmane

Hi @arthavmane, did you find a way to get the d-vectors? I am working on a similar project.

zhs105 avatar Sep 22 '20 07:09 zhs105

I haven't tried UIS-RNN yet and only found this library yesterday, but I can extract the embeddings with: _, EMBEDS, wav_splits = encoder.embed_utterance(wav, return_partials=True)

davide-scalzo avatar Oct 07 '20 13:10 davide-scalzo
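For context, here is a minimal sketch of how per-utterance partial embeddings like EMBEDS might be stacked into a training input for UIS-RNN. It assumes, based on my reading of the UIS-RNN README, that training takes a 2-D float array of shape (length, 256) plus a 1-D array of string labels of the same length, with labels kept unique across utterances. The helper name build_uisrnn_inputs and the random stand-in data are hypothetical; in practice the arrays would come from encoder.embed_utterance(wav, return_partials=True).

```python
import numpy as np

def build_uisrnn_inputs(partials_per_utterance, speaker_per_utterance):
    """Hypothetical helper: stack partial embeddings and build per-row labels."""
    sequences, labels = [], []
    pairs = zip(partials_per_utterance, speaker_per_utterance)
    for idx, (partials, speaker) in enumerate(pairs):
        partials = np.asarray(partials, dtype=float)  # (n_partials, 256)
        sequences.append(partials)
        # Prefix with the utterance index so labels stay unique across utterances
        # (an assumption about what UIS-RNN expects).
        labels.extend([f"{idx}_{speaker}"] * len(partials))
    return np.concatenate(sequences, axis=0), np.array(labels)

# Random stand-ins for the partial embeddings of two utterances.
rng = np.random.default_rng(0)
fake_partials = [rng.normal(size=(3, 256)), rng.normal(size=(5, 256))]

train_sequence, train_cluster_id = build_uisrnn_inputs(fake_partials, ["alice", "bob"])
print(train_sequence.shape, train_cluster_id.shape)  # (8, 256) (8,)
```

The point of the sketch is that UIS-RNN consumes a sequence of d-vectors per recording, so the partial embeddings (one row per window) are likely the more useful output here than a single utterance-level vector.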

@davodesign84 actually this command doesn't output a single 256-element array. The EMBEDS variable has shape (#, 256), but it should be (1, 256). I think it first splits the audio into segments and then computes an embedding for each one, but it should use the entire audio. Any clue how to do that?

saumyaborwankar avatar Feb 11 '21 04:02 saumyaborwankar

Actually, I found out, @davodesign84 and @zhs105: you can just call embed = encoder.embed_utterance(wav), and it'll give you a single 256-element array (shape (256,); reshape it if you need (1, 256)), which is your embedding for the specific wav file.

saumyaborwankar avatar Feb 11 '21 04:02 saumyaborwankar
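To relate the two outputs discussed above: as far as I can tell from the Resemblyzer source, the utterance-level embedding is the mean of the partial embeddings, L2-normalized. The sketch below reproduces that step with NumPy only; the function name utterance_embedding and the random input are hypothetical stand-ins for the (#, 256) array returned with return_partials=True.

```python
import numpy as np

def utterance_embedding(partial_embeds):
    """Average partial embeddings and L2-normalize the result."""
    partial_embeds = np.asarray(partial_embeds, dtype=np.float32)  # (n_partials, 256)
    raw = partial_embeds.mean(axis=0)                              # (256,)
    return raw / np.linalg.norm(raw, 2)

# Random stand-in for the partial embeddings of one wav file.
rng = np.random.default_rng(0)
fake_partials = rng.normal(size=(7, 256)).astype(np.float32)

embed = utterance_embedding(fake_partials)
print(embed.shape)          # (256,)

# Add a leading axis if downstream code expects a (1, 256) row vector.
row = embed[np.newaxis, :]
print(row.shape)            # (1, 256)
```

So the whole audio is still processed in windows either way; the single-vector call just collapses those windows into one embedding.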