
complex hotwords support (Current Model Limitations Discussion)

Open amoazeni75 opened this issue 2 years ago • 8 comments

Hi, thanks for your helpful research. I wonder whether the current model can handle complex hot words like "Hey Siri", or only single words, like "Siri"?

My second question is about hot words whose pronunciation takes more than 1 second, like "Hey XXXX." Does your model support changing the recording time?

Did you try using cosine_similarity instead of Euclidean distance at inference time?

Thanks.

amoazeni75 avatar Apr 10 '22 19:04 amoazeni75

+1 We also want to use a custom hotword that is about 2 seconds long, but with the current python -m eff_word_net.generate_reference method, detection seems to be unreliable.

So we would also like support for changing the recording time!

dominickchen avatar Apr 11 '22 04:04 dominickchen

Sorry for the delayed response. The model was trained on single words; however, it should also work on simple phrases like "Hey xxx". The current model was trained on 1-second audio clippings, so bizarre behaviour might occur when trying to process audio clippings longer than 1 second. I pushed a commit https://github.com/Ant-Brain/EfficientWord-Net/commit/c9dee140c6cc44c2adf985f42519e382ee0d0eab explaining the same.
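Because of that 1-second constraint, longer clips should be trimmed or padded before being fed to the model. A minimal sketch of that preprocessing step, assuming a 16 kHz mono signal (the sample rate and helper name are illustrative assumptions, not part of the library):

```python
import numpy as np

RATE = 16000          # assumed sample rate; 16 kHz is common for wakeword models
WINDOW_SECS = 1.0     # the model was trained on 1-second clippings

def fit_to_window(audio: np.ndarray, rate: int = RATE,
                  window_secs: float = WINDOW_SECS) -> np.ndarray:
    """Pad or trim a mono clip so its length matches the model's window."""
    target = int(rate * window_secs)
    if len(audio) < target:
        # zero-pad shorter clips at the end
        return np.pad(audio, (0, target - len(audio)))
    # trim longer clips; keeping only the first second is a naive choice
    return audio[:target]
```

In a real pipeline a sliding window over the incoming stream would be preferable to a hard trim, so that a hotword spoken late in a long clip is not cut off.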

The model was trained using Euclidean distance, hence it uses the same metric at inference time too.
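For context, the two metrics discussed above can be computed on embedding vectors like this (illustrative only; the library's actual embeddings and thresholds are not shown):

```python
import numpy as np

def euclidean_distance(a: np.ndarray, b: np.ndarray) -> float:
    # Lower distance => closer match
    return float(np.linalg.norm(a - b))

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Higher similarity (max 1.0) => closer match
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy reference and live embeddings (hypothetical values)
ref = np.array([0.2, 0.8, 0.1])
live = np.array([0.25, 0.75, 0.12])
```

Since the reference embeddings were produced by a network optimized against Euclidean distance, swapping in cosine similarity at inference time without retraining could shift the decision threshold in unpredictable ways.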

Coming to increasing the hotword length: hotwords are usually short, so maybe we can extend the processing window to 1.5 seconds, but I am not really sure about 2 seconds. Can you give a few examples where a hotword could be longer than 1.5 seconds?

Kindly post your additional model suggestions on the discussions page https://github.com/Ant-Brain/EfficientWord-Net/discussions/3

Join the same channel and put forward your queries there. I am planning to create a faster, more performant version of the current implementation soon, so your suggestions will be helpful.

TheSeriousProgrammer avatar Apr 15 '22 17:04 TheSeriousProgrammer

Thanks for the information. An example of a long wake word is "Hey MercedesBenz". Could you please provide the training steps?

amoazeni75 avatar Apr 15 '22 18:04 amoazeni75

Sorry for the delay, I didn't have time to clean the repository which held the training code. It is built using Keras: https://github.com/Ant-Brain/wakeword_dataset_generator . It has both the training code and the dataset generator code.

TheSeriousProgrammer avatar Apr 17 '22 17:04 TheSeriousProgrammer

Hey, thanks for this repo.

I cannot find your training code here: https://github.com/Ant-Brain/wakeword_dataset_generator . Is it available in any other repo?

Durgesh92 avatar Jun 30 '22 18:06 Durgesh92

Extremely sorry for the delay, my bad; I forgot to add the notebook which contains the training code: https://colab.research.google.com/drive/1hH6q3cGneIWxNRLwbVAKIBzHoVVFlEO3?usp=sharing

TheSeriousProgrammer avatar Jul 14 '22 06:07 TheSeriousProgrammer

Currently working on a newer model with better performance and support for longer hotwords; it will be available in a month's time.

TheSeriousProgrammer avatar Jul 14 '22 06:07 TheSeriousProgrammer

Update

A newer model with better resilience to noise and support for a 1.5-second window has been added to the flow. Kindly check it out!! (It has taken more than a month for the update XD)

TheSeriousProgrammer avatar Apr 14 '23 11:04 TheSeriousProgrammer