speech_recognition
Adds Parameter use_enhanced and model to GoogleCloudSpeech
Adds the parameters use_enhanced and model to the recognize_google_cloud method, giving users more customization options and better results in specific cases.
Hello @Uberi and @ftnext, I was wondering if it's possible for someone to review my merge request.
Thank you very much, Vitor Hideyoshi.
Hello @ftnext, is there any interest in this feature? It doesn't break any part of the GoogleCloudSpeech Python API, it only extends it. I'm already using this implementation at the company I work for, but would love to have this feature merged. If there is anything blocking the merge, please tell me :)
Hi @HideyoshiNakazone!
Looks good overall, but would it be possible to document these parameters in the docs for that function? If so, happy to merge this!
@Uberi, thanks a lot! I added the parameters to the docstring of the method Recognizer.recognize_google_cloud and added them to the library reference file.
If there are any other places you'd like me to add documentation, I'll be happy to :)
@HideyoshiNakazone Thank you very much for this pull request! I'm very sorry for responding so late. @Uberi Thanks for your comment!
In my opinion, it would be better to introduce keyword arguments (a.k.a. **kwargs):
https://docs.python.org/3/tutorial/controlflow.html#keyword-arguments
Certainly, adding use_enhanced and model as arguments would implement this feature.
However, if more arguments need to be added in the future, each one would again require changing the method signature, so this approach is not easy to extend.
I think it would be preferable for Cloud Speech API-specific arguments to be specified as variable keyword arguments.
def recognize_google_cloud(self, audio_data, credentials_json=None, language="en-US", preferred_phrases=None, show_all=False, **api_params):
    """
    If ``preferred_phrases`` is an iterable of phrase strings, ...
    api_params: Cloud Speech API-specific parameters as dict (optional)
    The ``use_enhanced`` is a boolean option ...
    Furthermore, you can use the option ``model`` to set your desired model,
    Returns the most likely transcription if ``show_all`` is False (the default).
    """
    config = {
        'encoding': speech.RecognitionConfig.AudioEncoding.FLAC,
        'sample_rate_hertz': audio_data.sample_rate,
        'language_code': language,
        **api_params,
    }
(It seems that preferred_phrases might be included in api_params too, but this is another issue)
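
To make the merge behaviour of **api_params concrete, here is a minimal, self-contained sketch. The helper name build_recognition_config and the plain-string encoding are assumptions made only so the snippet runs without the Cloud Speech client; they are not part of the library, and the model value is just an illustrative string.

```python
def build_recognition_config(sample_rate_hertz, language="en-US", **api_params):
    """Hypothetical helper: combine the fixed settings with API-specific options."""
    config = {
        # Placeholder string; the real method would use
        # speech.RecognitionConfig.AudioEncoding.FLAC here.
        "encoding": "FLAC",
        "sample_rate_hertz": sample_rate_hertz,
        "language_code": language,
        # Unpacked last, so any extra keyword argument is added to the dict
        # (and would override a key of the same name defined above).
        **api_params,
    }
    return config


# Without extra keyword arguments, only the fixed settings are present.
default_config = build_recognition_config(16000)

# New API options are passed through without changing the signature.
enhanced_config = build_recognition_config(
    16000, use_enhanced=True, model="phone_call"
)
```

One consequence of unpacking api_params last is that a caller could also override keys such as encoding. If that is undesirable, the fixed keys could be placed after the unpacking instead.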