Resemblyzer d-Vectors for UIS-RNN


Open arthavmane opened this issue 5 years ago • 4 comments

I'm working on a project in which I want to train a model on d-vector embeddings. Can someone please explain how to compute d-vectors for different utterances from different speakers, so that I can pass them into the UIS-RNN model?

May 06 '20 09:05 arthavmane

Hi @arthavmane, did you find a way to get the d-vectors? I am working on a similar project.

Sep 22 '20 07:09 zhs105

I haven't tried UIS-RNN yet and only found this library yesterday, but I can extract the embeddings with: _, EMBEDS, wav_splits = encoder.embed_utterance(wav, return_partials=True)

Oct 07 '20 13:10 davide-scalzo

@davodesign84 actually, this command doesn't output a single 256-element array. The EMBEDS variable has shape (#, 256), but it should be (1, 256). I think it first splits the audio into segments and then computes an embedding for each segment, but it should use the entire audio. Any clue how to do that?
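For context on why EMBEDS has one row per segment: as far as I can tell from the Resemblyzer source, the second value returned with return_partials=True holds the partial embeddings, and the single utterance embedding is their average rescaled to unit length. The sketch below reproduces that step with NumPy only. The helper name utterance_embed_from_partials and the random stand-in data are hypothetical, used just to show the shapes.

```python
import numpy as np


def utterance_embed_from_partials(partial_embeds: np.ndarray) -> np.ndarray:
    """Collapse (n_partials, dim) partial embeddings into one unit-length (dim,) vector.

    Assumption: this mirrors how the utterance-level embedding is derived
    (mean over partials, then L2 normalization).
    """
    raw = partial_embeds.mean(axis=0)
    return raw / np.linalg.norm(raw, 2)


# Stand-in data: 5 partial embeddings of size 256 (random numbers, not real audio).
rng = np.random.default_rng(0)
partials = rng.random((5, 256)).astype(np.float32)

embed = utterance_embed_from_partials(partials)
print(embed.shape)  # (256,)
```

So the (#, 256) array is not wrong as such; it is the per-segment view of the same utterance, and it can be reduced to one vector whenever a single embedding is needed.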

Feb 11 '21 04:02 saumyaborwankar

Actually, I found out, @davodesign84 and @zhs105: you can just call embed = encoder.embed_utterance(wav) and it'll give you a single 256-element array, which is the embedding for that specific wav file.
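On the original question of feeding UIS-RNN: as I read the UIS-RNN README, training expects one concatenated 2-D float array of observations plus a matching 1-D array of string labels, so the per-segment partial embeddings may be the more useful output there. The following is a minimal sketch under those assumptions, not a tested pipeline. build_uisrnn_inputs and partials_for_file are hypothetical helper names, the speaker labels are made up, and the cast to float64 is an assumption about what UIS-RNN accepts.

```python
import numpy as np


def partials_for_file(encoder, wav_path):
    """Return the (n_partials, 256) partial embeddings for one audio file.

    Assumes `encoder` is a resemblyzer.VoiceEncoder. The import is done lazily
    so the rest of this sketch can be loaded without Resemblyzer installed.
    """
    from resemblyzer import preprocess_wav

    wav = preprocess_wav(wav_path)
    _, partial_embeds, _ = encoder.embed_utterance(wav, return_partials=True)
    return partial_embeds


def build_uisrnn_inputs(partials_by_utterance):
    """Stack per-utterance partial embeddings into one sequence with labels.

    partials_by_utterance: list of (speaker_label, partial_embeds) pairs,
    where partial_embeds has shape (n_i, dim).
    Returns (train_sequence, train_cluster_id) with shapes (N, dim) and (N,).
    """
    sequences = []
    cluster_ids = []
    for speaker, partial_embeds in partials_by_utterance:
        # Assumption: UIS-RNN wants plain Python-float (float64) observations.
        partial_embeds = np.asarray(partial_embeds, dtype=np.float64)
        sequences.append(partial_embeds)
        # One label per partial embedding, all from the same speaker.
        cluster_ids.extend([str(speaker)] * len(partial_embeds))
    train_sequence = np.concatenate(sequences, axis=0)
    train_cluster_id = np.array(cluster_ids)
    return train_sequence, train_cluster_id


# Stand-in data instead of real audio: two utterances, 3 and 2 partials each.
rng = np.random.default_rng(0)
fake_utterances = [
    ("alice", rng.random((3, 256)).astype(np.float32)),
    ("bob", rng.random((2, 256)).astype(np.float32)),
]
train_sequence, train_cluster_id = build_uisrnn_inputs(fake_utterances)
print(train_sequence.shape, train_cluster_id.shape)  # (5, 256) (5,)
```

With real data, the list would be filled by calling partials_for_file(encoder, path) for each recording. If several independent sequences are concatenated, the README also asks that labels be unique across sequences, so prefixing each label with a sequence index may be needed.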

Feb 11 '21 04:02 saumyaborwankar
