Global conditioning on speaker identification

Open basveeling opened this issue 9 years ago • 3 comments

Perhaps by using a Keras embedding layer to learn a representation for each speaker?
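A rough sketch of what this could look like, written in plain NumPy so the mechanism is visible. It assumes the gated-activation form of global conditioning, where a projected speaker vector is added to the filter and gate pre-activations at every timestep. All names and sizes (`speaker_table`, `V_f`, `V_g`, `n_speakers`, ...) are hypothetical and not taken from this repository. In Keras, `speaker_table` would correspond to the weights of an `Embedding` layer trained jointly with the rest of the network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_speakers = 4   # number of distinct speaker ids
embed_dim = 8    # size of the speaker representation
n_filters = 16   # channels in one dilated-convolution block
n_steps = 100    # audio timesteps

# Stand-in for the weights of a learned embedding layer: one row per speaker.
speaker_table = rng.normal(size=(n_speakers, embed_dim))

# Hypothetical projections from the speaker vector to the filter/gate channels.
V_f = rng.normal(size=(embed_dim, n_filters))
V_g = rng.normal(size=(embed_dim, n_filters))


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def gated_unit(x_f, x_g, speaker_id):
    """Gated activation with a speaker-dependent bias.

    x_f, x_g: (n_steps, n_filters) outputs of the filter and gate convolutions.
    speaker_id: integer index into speaker_table.
    """
    h = speaker_table[speaker_id]  # (embed_dim,)
    bias_f = h @ V_f               # (n_filters,)
    bias_g = h @ V_g               # (n_filters,)
    # The same bias is broadcast over every timestep ("global" conditioning).
    return np.tanh(x_f + bias_f) * sigmoid(x_g + bias_g)


# Random stand-ins for the convolution outputs of one block.
x_f = rng.normal(size=(n_steps, n_filters))
x_g = rng.normal(size=(n_steps, n_filters))

z0 = gated_unit(x_f, x_g, speaker_id=0)
z1 = gated_unit(x_f, x_g, speaker_id=1)
```

The same input produces different activations for different speaker ids, which is the point of the conditioning. Whether the learned embedding ends up being a useful speaker representation is an open question here.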

basveeling, Sep 27 '16 10:09

I'm actually very interested in this as well. Just to clarify, are you referring to something like FaceNet for voice? Have you done any more research into this area as of late?

malzzz, Jan 13 '17 15:01

I haven't, but this would be interesting! I don't have any access to NN training hardware right now, but I'd love to see if this works.

basveeling, Feb 06 '17 20:02

Would this paper be relevant here?

[Attached screenshot, 2017-03-04 at 18:28:25]

There are also a few deep embedded clustering (DEC) implementations around, including one in Keras: https://github.com/fferroni/DEC-Keras, though I don't know how well tested it is.

faroit, Mar 04 '17 17:03