Adjust `end of speech` timeout on streaming speech to text

Open piotrgregor opened this issue 7 months ago • 0 comments

What component of google-cloud-cpp is this feature request for?

google/cloud/speech/speech_client

Is your feature request related to a problem? Please describe.

I would like to reduce the time Google's streaming speech-to-text waits before it detects the end of speech.

Describe the solution you'd like

I would like a parameter added to the streaming recognition config (speech::v1::StreamingRecognizeRequest) that sets this timeout, similar to what EndSilenceTimeout does on the Microsoft side (https://learn.microsoft.com/en-us/windows/apps/design/input/set-speech-recognition-timeouts):

End Silence Timeout: This timeout is triggered after a phrase has been successfully recognized, and the service waits for further speech input before finalizing the recognition result. The EndSilenceTimeout property determines how long the service waits for additional speech after a recognized phrase before concluding that the speech input has ended. This timeout can be adjusted to accommodate various speaking styles, allowing for short pauses within a longer phrase without prematurely ending the recognition.
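To make the requested behavior concrete, here is a minimal client-side sketch of end-silence detection. It is an illustration only: `EndSilenceDetector` and `OnFrame` are hypothetical names, not part of google-cloud-cpp or any speech API, and the sketch assumes some voice activity detector already labels each audio frame as speech or silence.

```cpp
#include <chrono>

// Hypothetical illustration only: not part of google-cloud-cpp or any
// speech service API. It models the "end silence timeout" described above.
class EndSilenceDetector {
 public:
  explicit EndSilenceDetector(std::chrono::milliseconds end_silence_timeout)
      : timeout_(end_silence_timeout) {}

  // Feed one audio frame: its timestamp relative to the start of the stream
  // and whether an (assumed) voice activity detector marked it as speech.
  // Returns true once the silence that follows speech has lasted at least
  // the configured timeout.
  bool OnFrame(std::chrono::milliseconds timestamp, bool is_speech) {
    if (is_speech) {
      heard_speech_ = true;
      last_speech_ = timestamp;
      return false;
    }
    // Leading silence never ends the utterance; only silence after speech does.
    return heard_speech_ && (timestamp - last_speech_ >= timeout_);
  }

 private:
  std::chrono::milliseconds timeout_;
  std::chrono::milliseconds last_speech_{0};
  bool heard_speech_ = false;
};
```

With a 500 ms timeout, a 300 ms pause in the middle of a phrase does not end the utterance, while 500 ms of silence after the last speech frame does. A smaller timeout gives the quicker turn-taking I would like to experiment with, at the cost of cutting off speakers who pause longer.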

Describe alternatives you've considered

I am not aware of any alternatives, but please let me know if one exists.

Additional context

This request originates from work on AI voice assistants, where I would like to experiment with slightly shorter end-of-speech timeouts.

Jun 13 '25 17:06 piotrgregor