rhasspy [Feature] Detect person

Open jwillmer opened this issue 5 years ago • 2 comments

It would be very useful if we could train the system to recognize who is speaking. Depending on the person, we could then run or ignore a command. For instance:

a guest in the house can't reorder (buy) supplies by talking to the voice assistant
the kids can't start a movie via the voice assistant if the movie is not suitable for their age
..
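
The rules above could be sketched as a small permission check that runs after the speaker has been identified and before an intent is executed. Everything here is an assumption for illustration: the role names, the intent names (`OrderSupplies`, `PlayMovie`, `GetTime`), the `min_age` slot and the `is_allowed` helper are hypothetical and not part of Rhasspy.

```python
# Hypothetical role table: which intents each recognized speaker may trigger.
PERMISSIONS = {
    "parent": {"OrderSupplies", "PlayMovie", "GetTime"},
    "kid": {"PlayMovie", "GetTime"},
    "guest": {"GetTime"},
}

# Hypothetical age per role; a movie is allowed only if its minimum age fits.
AGE_LIMITS = {"kid": 10}


def is_allowed(speaker, intent, slots=None):
    """Return True if `speaker` may run `intent`.

    Unknown speakers are treated as guests (the most restrictive role).
    """
    slots = slots or {}
    role = speaker if speaker in PERMISSIONS else "guest"
    if intent not in PERMISSIONS[role]:
        return False
    limit = AGE_LIMITS.get(role)
    if intent == "PlayMovie" and limit is not None:
        min_age = slots.get("min_age")
        # Deny when the rating is unknown rather than guessing.
        return min_age is not None and min_age <= limit
    return True


if __name__ == "__main__":
    print(is_allowed("guest", "OrderSupplies"))
    print(is_allowed("kid", "PlayMovie", {"min_age": 12}))
```

Treating unrecognized voices as guests keeps the failure mode safe: a misidentified speaker loses access instead of gaining it.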

Mar 12 '20 21:03 jwillmer

Kaldi apparently supports this through something called "x-vectors". I'd be interested in adding this, but I haven't had time to look into how to do a basic "WAV files + labels" training for classification.

BTW, the kids activating Rhasspy are why I can't really use it at home much :/

Mar 28 '20 19:03 synesthesiam

I've tested Kaldi "i-vectors" for speaker identification, but they need a LOT of training data to approach a satisfactory error rate (a few hundred short WAVs per user is apparently the minimum).

The best I got with around 5 samples per user was a 24% error rate, following this: http://jrmeyer.github.io/asr/2017/09/29/challenge.html

The "x-vectors" bring some improvements, but they still need hundreds of samples per user to perform well (around a 7-8% error rate).

It would be pretty awesome to achieve speaker identification though 😊
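
For readers unfamiliar with how such an error rate is measured, here is a minimal sketch of the enroll-then-identify step, assuming each WAV has already been turned into a fixed-length embedding (an i-vector or x-vector) by some extractor that is not shown. It uses plain cosine scoring against per-speaker averages, which is a simplification and not necessarily the scoring used in the tests above. The function names, the toy 2-D vectors and the 0.7 threshold are all made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def enroll(samples):
    """Average the labelled embeddings into one centroid per speaker."""
    centroids = {}
    for label, vecs in samples.items():
        dim = len(vecs[0])
        centroids[label] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return centroids


def identify(embedding, centroids, threshold=0.7):
    """Return the closest speaker, or None if no score reaches the threshold."""
    best_label, best_score = None, -1.0
    for label, centroid in centroids.items():
        score = cosine(embedding, centroid)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None


def error_rate(test_set, centroids, threshold=0.7):
    """Fraction of (label, embedding) pairs that are not identified correctly."""
    wrong = sum(1 for label, emb in test_set
                if identify(emb, centroids, threshold) != label)
    return wrong / len(test_set)


if __name__ == "__main__":
    # Toy 2-D "embeddings"; real ones have hundreds of dimensions.
    centroids = enroll({
        "alice": [[1.0, 0.0], [0.9, 0.1]],
        "bob": [[0.0, 1.0], [0.1, 0.9]],
    })
    print(identify([1.0, 0.1], centroids))
    print(error_rate([("alice", [1.0, 0.0]), ("bob", [1.0, 0.0])], centroids))
```

With only a handful of samples per user the centroids are noisy, which is one plausible reason the error rate stays high until much more data is available.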

Mar 28 '20 19:03 mathquis
