EXTRA_SPEECH_INPUT_MINIMUM_LENGTH_MILLIS

Open hamzadahmani opened this issue 3 years ago • 14 comments

Is it possible to make the app keep listening for an unlimited period of time?

hamzadahmani avatar Apr 30 '21 12:04 hamzadahmani

@hamzadahmani Did you get any solution? I'm looking for the same thing!

panktibardolia119933 avatar May 11 '21 02:05 panktibardolia119933

Did anyone get any solution?

phongtt-international avatar May 12 '21 09:05 phongtt-international

Nope, I have been trying for the last 6 months and didn't find any solution. I finally convinced my client to accept it as it is.

Sravansuhas avatar May 19 '21 20:05 Sravansuhas

I am trying to debug this right now, and I found that:

  1. stopSpeech is what stops the recording. I tried commenting out all stopSpeech calls, which resulted in unlimited recording but only one sentence and one recognition result (a single array).

  2. self.recognitionTask.isFinishing returns true, which causes the recording to finish.

  3. There is a variable called self.continuous that I cannot find any assignment for; it is read but never assigned, as far as I can tell.

  4. The sessionID is negative and is duplicated across recordings.

I attached a pic of the code that is closing the task.

If anyone has used the Voice API in iOS Objective-C, please help us debug this issue together. I am new to this package, so I am still trying to understand how the iOS SDK works.

The documentation for the SDK is here: https://developer.apple.com/documentation/speech/sfspeechrecognitiontask?language=objc

[Screenshot attached: "Screen Shot 2021-05-21 at 12 10 28 AM", showing the code that closes the task]

basemanabulsi avatar May 20 '21 21:05 basemanabulsi

Thank you for the investigation. Will gladly take a PR to fix this from anyone in the community

safaiyeh avatar May 24 '21 01:05 safaiyeh

For now, anyone facing this issue can work around it by using setTimeout in onSpeechResultsHandler and appending each value to an array, because calling this.stopVoice() fires onSpeechEndHandler(). When we start recognition and begin talking, the package fires two methods, onSpeechResultsHandler and onSpeechPartialResults, and recognition does not finish automatically.

For example, when I finish talking, the recognizer does not detect that I have finished and never calls onSpeechEndHandler(), and I can't simply call stop inside onSpeechResultsHandler, because it fires after the first word.

This is a temporary workaround until the issue is fixed.

This is a snippet of the code:

onSpeechResultsHandler(event) {
    if (Platform.OS === 'ios') {
      // Append each newly recognized phrase to state, skipping duplicates.
      if (event?.value?.length && !this.state.recognizedWords.includes(event.value[0])) {
        this.setState((prevState) => ({
          recognizedWords: [...prevState.recognizedWords, event.value[0]],
        }));
      }

      // Stop listening 6 seconds after a result arrives.
      setTimeout(() => {
        this.stopVoice();
      }, 6000);

      return null;
    }
  }

basemanabulsi avatar May 24 '21 18:05 basemanabulsi

@basemanabulsi The Android speech recognizer doesn't provide an API to listen continuously: https://developer.android.com/reference/android/speech/RecognizerIntent. So we have to implement it ourselves. I've found a workaround to make the app listen at all times: whenever the listener stops, you start it again.

It also throws a lot of errors, and I'm not sure why, since there is not much explanation in the error messages. So we need to restart the listener whenever it throws an error as well.

Here is an example:

import Voice from '@jamsch/react-native-voice';

export default class VoiceService {
    constructor() {
        Voice.onSpeechStart = this.onSpeechStartHandler.bind(this);
        Voice.onSpeechEnd = this.onSpeechEndHandler.bind(this);
        Voice.onSpeechResults = this.onSpeechResultsHandler.bind(this);
        Voice.onSpeechRecognized = this.onSpeechRecognizedHandler.bind(this);
        Voice.onSpeechError = this.onSpeechErrorHandler.bind(this);

        // Ask Android to listen for at least 10 seconds before stopping.
        Voice.start('en-US', {EXTRA_SPEECH_INPUT_MINIMUM_LENGTH_MILLIS: 10000});
    }

    onSpeechErrorHandler = (event) => {
        console.log('onSpeechErrorHandler: ', event);
        // Restart the listener on error so it never stays stopped.
        Voice.stop();
        Voice.start('en-US');
    };

    onSpeechStartHandler = (event) => {
        console.log('onSpeechStartHandler: ', event);
    };

    onSpeechRecognizedHandler = (event) => {
        console.log('onSpeechRecognizedHandler: ', event);
    };

    onSpeechEndHandler = (event) => {
        console.log('onSpeechEndHandler: ', event);
    };

    onSpeechResultsHandler = (event) => {
        console.log('onSpeechResultsHandler: ', event);
        // Restart after each final result to keep listening indefinitely.
        Voice.stop();
        Voice.start('en-US');
    };
}

There are a couple of parameters we can set to customize it. A list of all the parameters is given in the link above. Some of them, which are implemented in this npm package, are:

  • EXTRA_LANGUAGE_MODEL: Informs the recognizer which speech model to prefer when performing ACTION_RECOGNIZE_SPEECH.
  • EXTRA_MAX_RESULTS: Optional limit on the maximum number of results to return.
  • EXTRA_PARTIAL_RESULTS: Optional boolean to indicate whether partial results should be returned by the recognizer as the user speaks (default is false).
  • EXTRA_SPEECH_INPUT_MINIMUM_LENGTH_MILLIS: The minimum length of an utterance.
  • EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS: The amount of time that it should take after we stop hearing speech to consider the input complete.
  • EXTRA_SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS: The amount of time that it should take after we stop hearing speech to consider the input possibly complete.
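
As a sketch, those Android extras can be collected into an options object and passed as the second argument to Voice.start. Which keys are honored depends on the package version, and the LANGUAGE_MODEL_FREE_FORM value and the concrete durations here are illustrative assumptions, not recommendations:

```javascript
// Illustrative options object combining the Android extras listed above.
// Key support varies by package version, so treat this as an assumption.
const androidListenOptions = {
  EXTRA_LANGUAGE_MODEL: 'LANGUAGE_MODEL_FREE_FORM', // prefer free-form dictation
  EXTRA_MAX_RESULTS: 5,                             // cap alternative transcriptions
  EXTRA_PARTIAL_RESULTS: true,                      // stream partial results while speaking
  EXTRA_SPEECH_INPUT_MINIMUM_LENGTH_MILLIS: 10000,  // listen for at least 10 s
  EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS: 2000, // 2 s of silence ends input
};

// Usage (hypothetical): Voice.start('en-US', androidListenOptions);
```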

erkankarabulut avatar May 24 '21 19:05 erkankarabulut

@erkankarabulut Nice approach, but I think it will keep opening and closing the session, and it won't recognize whether you have finished talking or not.

basemanabulsi avatar May 25 '21 12:05 basemanabulsi

@basemanabulsi Yes, it opens and closes the session continuously. You can continuously concatenate the recognized speech. If you continue to speak during the close-open transition, a small part of the speech (mostly one word) will be lost. The recording time definitely needs to be parameterized. Edit: I updated my comment above.

erkankarabulut avatar May 25 '21 12:05 erkankarabulut

@erkankarabulut Yeah, good ideas. What I was thinking is to keep an array of the words and keep appending to it before onSpeechEndHandler is called from iOS; at some point after finishing, it will fire. I will try with EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS.

Do you have any ideas about it? I am thinking of ending the session if no word is recognized for a given number of seconds.

basemanabulsi avatar May 26 '21 18:05 basemanabulsi

@basemanabulsi If I understand correctly, you are trying to stop recording/recognizing when the user stops speaking. But this is already handled by SpeechRecognizer on Android: the onSpeechResultsHandler method is only called when the user stops speaking, never in the middle.

And the EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS parameter decides the duration of silence after the user stops speaking before recognition stops. I think it is supposed to be the same on iOS as well.

erkankarabulut avatar May 26 '21 18:05 erkankarabulut

@erkankarabulut The issue I am trying to solve is on iOS; the recognizer doesn't stop automatically. It keeps listening even if I have stopped talking.

Also, I want to try to add a limit to the recognition based on your nice idea.

basemanabulsi avatar May 26 '21 18:05 basemanabulsi

@basemanabulsi I didn't know that it was problematic on iOS. Maybe you can use the onSpeechPartialResults method, which captures partial recognition results. You can call onSpeechEndHandler yourself if onSpeechPartialResults has not been called for a while, and end the session there.
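
That timer-based idea could be sketched as a small helper independent of the library itself. The name createSilenceDetector and the 2-second default below are my own assumptions, not part of any package API:

```javascript
// Hypothetical helper: call notify() on every partial result; if no partial
// result arrives within silenceMs, onSilence fires once. Not part of the
// react-native-voice API. Wire notify() into Voice.onSpeechPartialResults
// and call Voice.stop() (or your own end handler) inside onSilence.
function createSilenceDetector(onSilence, silenceMs = 2000) {
  let timer = null;
  return {
    notify() {
      // Each partial result pushes the silence deadline further out.
      if (timer) clearTimeout(timer);
      timer = setTimeout(onSilence, silenceMs);
    },
    cancel() {
      // Call when recognition is stopped externally to avoid a stray fire.
      if (timer) clearTimeout(timer);
      timer = null;
    },
  };
}
```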

erkankarabulut avatar May 26 '21 19:05 erkankarabulut