[$200 bounty] Voice entry demo

Open adeebshihadeh opened this issue 1 year ago • 2 comments

We want to use voice entry for setting nav destinations and more. The goal of this is to get a small, self-contained demo working of detecting a phrase like "Hey comma, navigate home". The script will run forever: detect start of speech, detect end of speech, do speech to text, then print it out.
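For readers who want a concrete picture of that loop, here is a minimal sketch of the start-of-speech / end-of-speech / transcribe / print cycle. It is an illustration under stated assumptions, not the implementation the bounty asks for: a simple RMS energy threshold stands in for real wake word detection, the `threshold`, `start_frames`, and `end_frames` values are arbitrary, and `transcribe` is a hypothetical callback where a local model or a speech-to-text API call would go.

```python
from typing import Callable, Iterable, Iterator, List, Sequence

Frame = Sequence[float]  # one fixed-length chunk of mono samples in [-1.0, 1.0]


def rms(frame: Frame) -> float:
    """Root-mean-square energy of one frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return (sum(s * s for s in frame) / len(frame)) ** 0.5


def segment_utterances(
    frames: Iterable[Frame],
    threshold: float = 0.02,   # assumed energy threshold, tune per microphone
    start_frames: int = 3,     # consecutive loud frames that mark start of speech
    end_frames: int = 10,      # consecutive quiet frames that mark end of speech
) -> Iterator[List[Frame]]:
    """Yield one list of frames per detected utterance."""
    utterance: List[Frame] = []
    loud_run = 0
    quiet_run = 0
    in_speech = False
    for frame in frames:
        loud = rms(frame) >= threshold
        if not in_speech:
            if loud:
                # Buffer candidate frames so the start of the phrase is kept.
                utterance.append(frame)
                loud_run += 1
                if loud_run >= start_frames:
                    in_speech = True
                    quiet_run = 0
            else:
                # A quiet frame before the start condition resets the candidate.
                utterance = []
                loud_run = 0
        else:
            utterance.append(frame)
            quiet_run = 0 if loud else quiet_run + 1
            if quiet_run >= end_frames:
                # End of speech: emit the utterance without trailing silence.
                yield utterance[:-quiet_run]
                utterance, loud_run, quiet_run, in_speech = [], 0, 0, False


def run(frames: Iterable[Frame], transcribe: Callable[[List[Frame]], str]) -> None:
    """Runs for as long as `frames` keeps producing audio."""
    for utterance in segment_utterances(frames):
        # `transcribe` is a hypothetical hook for the speech-to-text step.
        print(transcribe(utterance))
```

In a real demo `frames` would be an endless generator reading from the microphone, and the energy check would be replaced by a wake word model that meets the local, low-latency requirement.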

Requirements:

  • wake word detection needs to be local and fast
  • speech to text can use an API
  • must be low latency (<1s)
  • new dependencies must be used only when necessary
  • doesn't use any GPU; CPU and DSP are available though
  • has to work reliably in the expected use case
    • mounted on the windshield and speaking
    • while openpilot is onroad (everything running)
    • can't make the rest of openpilot lag

See how we're running other models on the device: https://github.com/commaai/openpilot/tree/master/selfdrive/modeld

Jan 02 '24 01:01 adeebshihadeh

Can you explain this part? "The goal of this is to get a small, self-contained demo working of detecting a phrase like "Hey comma, navigate home"." Do you want a standalone script with a demo, or a script that lives in the repo but runs separately for the demo?

Jan 02 '24 18:01 singh13apoorv

https://github.com/meirdev/commaai-assistant (with Whisper)

Jan 02 '24 19:01 meirdev

https://github.com/commaai/openpilot/pull/31010

Jan 15 '24 22:01 jakethesnake420