
Want to suggest a wake word? Leave your thoughts here. (AIS-1441)

Open feizi opened this issue 1 year ago • 97 comments

Hi all,

We're excited to offer the community more free and high-quality wake word models. Everyone has their own unique wake word preferences. Now, we're ready to regularly release some of the most popular wake words. Please let us know the wake words you want! English and Chinese are both welcome.

In the past, collecting high-quality human speech data was expensive. Our team has now developed a cost-effective way to train wake word models using only TTS samples, which reaches 90-95% of the accuracy of models trained on human-recorded samples.

The wake word models and esp-sr have the same license and are free for commercial use. If you want a more accurate and exclusive wake word, please use our wake word customization service.

feizi avatar Dec 14 '23 03:12 feizi

The Willow team and community would love "Hey Willow". It's our domain name because we've been waiting for this.

Thank you very much for offering this option, it's very exciting!

kristiankielhofner avatar Dec 14 '23 11:12 kristiankielhofner

I'm glad you like this. Since "hey" and "hi" sound quite similar, people sometimes don't notice the difference. So I was thinking we could support both "hey willow" and "hi willow" for waking up the device. That way, either phrase will work. When we release the wake word model, we'll name it something like "wn9_heywillow". What do you think?

feizi avatar Dec 14 '23 12:12 feizi

Good idea!

My only concern would be reduced overall accuracy (wake reliability vs. false wakes). We've noticed quite a bit of false waking with Alexa. From what I've read, the automated TTS approach has 90-95% of the accuracy of models trained on human samples. I like two-word wake words because they tend to improve accuracy. I suspect a pure "Hey Willow" wake word trained with the TTS approach could match or even beat the accuracy of the human-sample-trained Alexa model.

Of course we could always test this, even starting with a pure "Hey Willow" model, a pure "Hi Willow" model, and a merged model.

Thanks again for offering this!

kristiankielhofner avatar Dec 14 '23 12:12 kristiankielhofner

Your concern is valid. We will train models for both options and test which one performs better.

feizi avatar Dec 14 '23 12:12 feizi

"hey/hi willow" model:

Model name: wn9_heywillow_tts

FAR (False Alarm Rate): 1 time / 8 hours; RAR (Right Alarm Rate): 88%

Test dataset description:

The FAR dataset: This dataset contains a total of 64 hours of audio, including audio collected from the internet and audio recorded using esp32-korvo boards.

The RAR dataset: This dataset was generated by multiple commercial TTS APIs and contains approximately 500 samples in total.

These data and models were not used in the training process. However, because TTS samples differ from human samples, please treat the test results with caution.
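For readers unfamiliar with the two metrics, here is a minimal sketch of how FAR and RAR figures like the ones above can be derived from raw counts. The function names and the counts (8 false alarms, 440 detections out of 500 samples) are illustrative assumptions chosen to match the reported rates; the thread does not publish the raw counts or the evaluation script.

```python
def false_alarm_rate(false_alarms: int, hours: float) -> float:
    """False alarms per hour of audio that does not contain the wake word."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return false_alarms / hours


def right_alarm_rate(detected: int, total: int) -> float:
    """Fraction of wake word samples that triggered a detection."""
    if total <= 0:
        raise ValueError("total must be positive")
    return detected / total


# Assumed counts, not taken from the thread:
# 8 false alarms over a 64-hour FAR dataset is 1 false alarm per 8 hours.
far = false_alarm_rate(false_alarms=8, hours=64)
hours_per_false_alarm = 1 / far

# Assumed counts: 440 detections out of 500 TTS samples is 88%.
rar = right_alarm_rate(detected=440, total=500)

print(f"FAR: 1 time / {hours_per_false_alarm:g} hours")
print(f"RAR: {rar:.0%}")
```

A lower FAR and a higher RAR are both better, and the two usually trade off against each other as the detection threshold changes.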

feizi avatar Dec 28 '23 07:12 feizi

Guys, what you are doing is really great. We have created a smart speaker called Homai based on the esp32-s3. We trained the model ourselves, but it is resource-intensive and not so easy to integrate into the pipeline. Could you please add support for our word Homai [ho'mai]? Thank you in advance!

AigizK avatar Dec 29 '23 06:12 AigizK

Hi @AigizK, "Homai" has only two syllables. It is difficult to reduce the false-trigger rate for one- and two-syllable phrases, so we recommend choosing a 3-5 syllable phrase as the wake word.
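As a rough illustration of the 3-5 syllable guideline, the sketch below checks hand-split candidate phrases against that range. The helper name and the syllable splits are assumptions made for this example; it is not part of esp-sr, and syllables are supplied manually because automatic syllable counting is unreliable.

```python
RECOMMENDED_MIN = 3  # lower bound of the recommended syllable range
RECOMMENDED_MAX = 5  # upper bound of the recommended syllable range


def syllable_count_ok(syllables: list[str]) -> bool:
    """Return True if the phrase has between 3 and 5 syllables inclusive."""
    return RECOMMENDED_MIN <= len(syllables) <= RECOMMENDED_MAX


# Hand-written syllable splits (assumed for illustration).
candidates = {
    "Homai": ["ho", "mai"],
    "Hey Willow": ["hey", "wil", "low"],
    "Sophia": ["so", "phi", "a"],
}

for phrase, syllables in candidates.items():
    verdict = "within" if syllable_count_ok(syllables) else "outside"
    print(f"{phrase}: {len(syllables)} syllables, {verdict} the recommended range")
```

The range is a recommendation rather than a hard limit, so a phrase outside it may still be worth testing.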

sun-xiangyu avatar Jan 03 '24 02:01 sun-xiangyu

Hi @sun-xiangyu We have already launched a project with this name, so we can't change it significantly. But can we use the variant "homa ai", where the sound 'A' is pronounced long?

AigizK avatar Jan 03 '24 13:01 AigizK

I'm sorry, but our TTS model currently cannot lengthen the pronunciation of a specific syllable. This means we cannot generate a large number of accurate “homa ai” phrases.

sun-xiangyu avatar Jan 04 '24 03:01 sun-xiangyu

Hi! Thank you for this awesome solution! We are developing a smart voice assistant called Sophia. Would it be possible to have the wake word "Hi Sophia"? This would help our user experience drastically. Thank you in advance!

PrathamG avatar Jan 09 '24 06:01 PrathamG

Hi @PrathamG , I'm glad you like it. "Sophia" sounds like a wake word that can be used directly. I mean, maybe we don't need an extra prefix "Hi". I suggest we start with just "Sophia". If the performance is not satisfactory, then we can train another one with "hi Sophia". What do you think?

sun-xiangyu avatar Jan 09 '24 07:01 sun-xiangyu

Sure, that sounds like a good plan! We can use only "Sophia" and test the performance first. Thank you

PrathamG avatar Jan 09 '24 08:01 PrathamG

If possible, I also wanted to request the wake word "Little Sophia". We are still unsure about which wake word to use, and having both options will help us determine this via user testing.

PrathamG avatar Jan 09 '24 09:01 PrathamG

Our computing resources are currently limited, and this project can produce about two wake word models per month, so we will prioritize popular wake words. Of course, if we have some spare time, "Little Sophia" is also fine.

sun-xiangyu avatar Jan 10 '24 03:01 sun-xiangyu

No worries, totally understandable! Looking forward to testing out the "Sophia" wake word

PrathamG avatar Jan 12 '24 04:01 PrathamG

"Sophia" model: wn9_sophia_tts

FAR (False Alarm Rate): 1 time / 8 hours; RAR (Right Alarm Rate): 97%

sun-xiangyu avatar Jan 18 '24 09:01 sun-xiangyu

“小美” or “小美同学” would be a perfect choice. It would suit a lot of use cases. We all want a wake word that sounds like a human name.

xygh avatar Jan 22 '24 12:01 xygh

@xygh, “小美同学” sounds good.

sun-xiangyu avatar Jan 23 '24 06:01 sun-xiangyu

"Sophia" model: wn9_sophia_tts

FAR (False Alarm Rate): 1 time / 8 hours; RAR (Right Alarm Rate): 97%

Thank you! We will test it out and report the results by next week

PrathamG avatar Jan 23 '24 07:01 PrathamG

@xygh, “小美同学” sounds good.

BTW, “你好小美” is also a perfect choice.

xygh avatar Jan 23 '24 09:01 xygh

"小当家" or "Hi 小星" would be our preferred wake word for our scenario. Thanks a lot!

Henry586 avatar Jan 25 '24 05:01 Henry586

The second version of "Sophia":

Model info: wakenet9l_tts1h8v2_Sophia_3_0.647_0.649

Performance: FAR (False Alarm Rate): 1 time / 8 hours; RAR (Right Alarm Rate): 95%

Improvement: Add "Sophie" and "Sophy" as hard negatives to reduce false triggers.
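To illustrate the idea of hard negatives (words close enough to the wake word to cause false triggers), here is a small sketch that ranks candidate words by spelling similarity. This is not the team's method, which the thread does not describe. The function names, the candidate list, and the 0.6 threshold are assumptions, and character-level similarity from Python's `difflib` is only a crude stand-in for phonetic similarity.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive character-level similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_hard_negatives(wake_word: str, candidates: list[str],
                        threshold: float = 0.6) -> list[tuple[str, float]]:
    """Return candidates at or above the threshold, most similar first."""
    scored = [
        (word, similarity(wake_word, word))
        for word in candidates
        if word.lower() != wake_word.lower()  # the wake word itself is a positive
    ]
    similar = [pair for pair in scored if pair[1] >= threshold]
    return sorted(similar, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidate vocabulary for illustration.
vocabulary = ["Sophy", "printer", "Sophie", "willow", "Sophia"]

for word, score in rank_hard_negatives("Sophia", vocabulary):
    print(f"{word}: {score:.2f}")
```

In practice a phonetic comparison would be a better fit than spelling, since the model hears sounds rather than letters.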

sun-xiangyu avatar Jan 25 '24 11:01 sun-xiangyu

"小当家" or "Hi 小星" would be our preferred wake word for our scenario. Thanks a lot!

Both of these words sound good. If you have no preference, we will choose "hi 小星".

sun-xiangyu avatar Jan 25 '24 11:01 sun-xiangyu

"小美同学" model info: wakenet9l_tts1h8_小美同学_3_0.633_0.644

FAR (False Alarm Rate): 1 time / 8 hours; RAR (Right Alarm Rate): 95%

feizi avatar Jan 30 '24 06:01 feizi

Hello! This is a great opportunity that I was hoping would come up, and I'm so glad it is now possible! I've seen that the wake words "Mycroft" and "Hey, Mycroft" are very popular in the community. "Mycroft" is also the name of my product, so it would very much improve the user experience. Would it be possible to have either of these trained and released for the community? Thank you so much in advance for this!

lewardo avatar Feb 11 '24 15:02 lewardo

@lewardo, I'm glad it could help you. Although "Mycroft" is simpler, it seems there are quite a few words that sound similar, so I'll prioritize training with "Hey Mycroft."

sun-xiangyu avatar Feb 19 '24 03:02 sun-xiangyu

@Henry586 ,

Hi,小星: wakenet9l_tts1h8_Hi,小星_3_0.626_0.630

Performance: FAR (False Alarm Rate): 1 time / 8 hours; RAR (Right Alarm Rate): 93%

sun-xiangyu avatar Feb 19 '24 08:02 sun-xiangyu

I'd love to have "hey printer" available as a wake word/phrase.

jmattsson avatar Feb 27 '24 23:02 jmattsson

I want to suggest a wake word: "小龙小龙". I'm glad to hear that you can create wake words.

Littledragon-wxl avatar Feb 29 '24 15:02 Littledragon-wxl

@lewardo, the performance of "Mycroft" also looks good. Please try it.

Mycroft: wakenet9l_tts1h8_Mycroft_3_0.625_0.629

Performance: FAR (False Alarm Rate): 1 time / 8 hours; RAR (Right Alarm Rate): 96%

feizi avatar Mar 04 '24 11:03 feizi