Audio Classification via Tensorflow Lite

Open anaisbetts opened this issue 3 years ago • 7 comments

Describe what you are trying to accomplish and why in non technical terms

Instead of throwing away the audio that Frigate captures, it would be cool to apply similar ML detection tricks to the audio and build binary sensors that detect various sounds heard by the cameras.

Describe the solution you'd like

Similar to ImageNet-trained models for images, YAMNet is a freely available model for recognizing a bunch of different audio events. Here's the total list of sounds that it detects.

Some interesting ones:

  • Speech / voices
  • Vehicle / Car
  • Various Animals (Bird/Cat/Dog)
  • The sound of something dropping or falling
  • Sirens and alarms (a similar feature exists on many commercial camera platforms)
  • Emergency Vehicle
  • Finger Snapping / Clapping, reimplement the most extra Clapper you've ever seen 😅
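
As a rough illustration of how classifier output could be turned into the binary sensors described above, here is a minimal sketch. It assumes the audio model returns one score between 0 and 1 per sound label; the sensor names, the grouping of labels, the threshold, and the example scores are all hypothetical and not taken from Frigate or YAMNet.

```python
from typing import Dict, Set

# Hypothetical grouping of sound labels into sensors. The label strings are
# illustrative; a real model's class names may differ.
SENSOR_LABELS: Dict[str, Set[str]] = {
    "speech": {"Speech"},
    "vehicle": {"Vehicle", "Car", "Emergency vehicle"},
    "animal": {"Bird", "Cat", "Dog"},
    "alarm": {"Siren", "Alarm"},
}


def binary_sensors(scores: Dict[str, float], threshold: float = 0.5) -> Dict[str, bool]:
    """Return an on/off state per sensor.

    A sensor is on when at least one of its labels has a score at or above
    the threshold. Labels missing from `scores` count as 0.0.
    """
    return {
        sensor: any(scores.get(label, 0.0) >= threshold for label in labels)
        for sensor, labels in SENSOR_LABELS.items()
    }


# Made-up scores for a single audio window.
example_scores = {"Speech": 0.91, "Dog": 0.12, "Siren": 0.64}
states = binary_sensors(example_scores)
```

With these made-up scores, the `speech` and `alarm` sensors would be on and the others off. A real integration would likely also need some smoothing over time so that a sensor does not flicker between windows.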

Describe alternatives you've considered

This could be built standalone, but it would also need the Edge TPU to be performant, and I suspect there is no resource sharing between Edge TPUs (i.e. you open one for exclusive access). It would also require extra ffmpeg processes (and so extra CPU usage), and it would mean recreating the infrastructure to talk to Home Assistant as well.
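
To make the "extra ffmpeg process" cost concrete, here is a sketch of the kind of command a standalone tool might spawn per camera to pull raw audio out of a stream. It only builds the argument list and does not run anything. The function name and the stream URL are hypothetical, and the 16 kHz mono format is an assumption based on what YAMNet expects as input; this is not how Frigate itself handles audio.

```python
from typing import List


def audio_extract_command(stream_url: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg argument list that writes raw mono PCM audio to stdout."""
    return [
        "ffmpeg",
        "-i", stream_url,         # input stream (hypothetical camera URL)
        "-vn",                    # ignore the video track
        "-ac", "1",               # downmix to a single (mono) channel
        "-ar", str(sample_rate),  # resample to the target sample rate
        "-f", "s16le",            # raw signed 16-bit little-endian samples
        "pipe:1",                 # write to stdout for the classifier to read
    ]


cmd = audio_extract_command("rtsp://camera.example/stream")
```

The resulting list could be passed to `subprocess.Popen` with `stdout=subprocess.PIPE`. Running one such process per camera alongside the existing video pipeline is the duplicated work the paragraph above is worried about.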

anaisbetts avatar Sep 28 '21 19:09 anaisbetts

Maybe this should be designed as a second AI level, meaning "chaining" AI models, like what has been discussed for also utilizing other AI models for object classification in Frigate?

ozett avatar Sep 29 '21 08:09 ozett

@ozett I don't think that these should be chained in most cases, but it definitely might be related to #1697 in terms of the infrastructure

anaisbetts avatar Sep 29 '21 12:09 anaisbetts

Baby crying! It is a must-have; it would simply add electronic nanny functionality out of the box.

Baael avatar Nov 08 '21 09:11 Baael

Adding on that this feature would ideally be a foundation to support speech transcription and voice fingerprinting, in addition to sound labeling.

Sound labels should include gunshot.

markfrancisonly avatar Jun 07 '22 17:06 markfrancisonly

I would love to see this, and while I don't have any experience coding in frigate (I just started playing with it), I would love to help if someone can point me in the right direction

arthurdarcet avatar Sep 02 '22 09:09 arthurdarcet

> I would love to see this, and while I don't have any experience coding in frigate (I just started playing with it), I would love to help if someone can point me in the right direction

https://docs.frigate.video/contributing

There's a lot to be explored for this feature, PRs are appreciated

NickM-27 avatar Sep 02 '22 11:09 NickM-27

I'd gladly place a bounty of $500 for someone to implement this; I would love to use Frigate as a baby monitor.

jespern avatar Sep 05 '22 11:09 jespern

Added in #6848

NickM-27 avatar Jul 01 '23 13:07 NickM-27

Amazing! Good work @NickM-27

anaisbetts avatar Jul 01 '23 18:07 anaisbetts