google-cloud-php
Speech API: Cannot use explicit_decoding_config with encoding = ENCODING_UNSPECIFIED
Hello,
I want to use the speech API to convert speech into text.
TL;DR
Using:
$explicitConfig = new Google\Cloud\Speech\V2\ExplicitDecodingConfig([
'encoding' => Google\Cloud\Speech\V1\RecognitionConfig\AudioEncoding::ENCODING_UNSPECIFIED,
'sample_rate_hertz' => 16000,
]);
Throws this error:
Invalid audio channel count value: 0. Values must be non-negative.
While using:
$explicitConfig = new Google\Cloud\Speech\V2\ExplicitDecodingConfig([
'encoding' => Google\Cloud\Speech\V1\RecognitionConfig\AudioEncoding::ENCODING_UNSPECIFIED,
'sample_rate_hertz' => 16000,
'audio_channel_count' => 2,
]);
Throws this error:
The RecognitionConfig proto is invalid:
* explicit_decoding_config.audio_channel_count: audio_channel_count isn't supported by the set encoding
Long and detailed version for the courageous ones :)
Environment details
- OS: macOS Sonoma 14.3 (23D56)
- PHP version: PHP 8.2.17
- Package name and version: google/cloud-speech 1.18.2
Steps to reproduce
I'm working on audio .aac files (generated by Instagram).
I tried the online GUI (https://console.cloud.google.com/speech/transcriptions) to check whether the .aac file would be supported, and it worked.
When using the GUI, after uploading the file I get this warning: Unable to automatically detect audio information. Please review your audio file and enter the relevant fields manually.
So I filled in the fields manually:
- Encoding = ENCODING_UNSPECIFIED
- Sample rate = 16000
- Channel count remains empty
This worked as shown on the screenshot above.
Then I wanted to do the same thing in code using the google/cloud-speech package.
I tried to use the auto_decoding_config option but got the following error:
Audio data does not appear to be in a supported encoding. If you believe this to be incorrect, try explicitly specifying the decoding parameters.
Which is the same behavior as the GUI.
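For reference, a minimal sketch of what the auto-detect attempt looks like. It assumes the package is installed through Composer (the `vendor/autoload.php` path is an assumption) and uses `en-US` as a placeholder language code; the model name is taken from the full example further down.

```php
<?php
// Assumption: dependencies were installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\Speech\V2\AutoDetectDecodingConfig;
use Google\Cloud\Speech\V2\RecognitionConfig;

// Let the service detect the encoding, sample rate and channel count itself.
// With the .aac file from this report, this is the variant that fails with
// "Audio data does not appear to be in a supported encoding."
$config = new RecognitionConfig([
    'auto_decoding_config' => new AutoDetectDecodingConfig(),
    'language_codes' => ['en-US'], // placeholder language code
    'model' => 'latest_long',
]);
```

The rest of the request (recognizer path, audio content) is the same as in the code example below.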
So I tried to use the explicit_decoding_config parameter instead, and it failed.
See the code below.
Code example
$audioFile = 'https://lookaside.fbsbx.com/ig_messaging_cdn/?asset_id=374095301647771&signature=AbxHJBUywVeA26a-1lSTIeODgXgrAsmxD7pCjaxDo7nNowZZvgE_3fC5jMA3H-9UX7AtT7vdNe3N772RgQpNbgBsvmfp3eT439xW14QykJsqVfvg0aC_GVOJ6sBLBhqDyEzDv7Vt08pCStD0dHvG7PHcL7Gp4RvddKRT_TSYVBQP3PTFPiECX9PsMK528lRG4FaYYIAXN4sBcyeIZsRK6EiiWxo_6g';
$client = new Google\Cloud\Speech\V2\Client\SpeechClient();
$content = file_get_contents($audioFile);
$explicitConfig = new Google\Cloud\Speech\V2\ExplicitDecodingConfig([
'encoding' => Google\Cloud\Speech\V1\RecognitionConfig\AudioEncoding::ENCODING_UNSPECIFIED,
'sample_rate_hertz' => 16000,
]);
$config = new Google\Cloud\Speech\V2\RecognitionConfig([
'explicit_decoding_config' => $explicitConfig,
'language_codes' => ['en-EN'],
'model' => 'latest_long',
]);
$request = new Google\Cloud\Speech\V2\RecognizeRequest([
'recognizer' => 'projects/{MY_PROJECT_ID}/locations/global/recognizers/_',
'config' => $config,
'content' => $content,
]);
$response = $client->recognize($request);
$results = $response->getResults();
foreach ($results as $result) {
$alternatives = $result->getAlternatives();
$mostLikely = $alternatives[0];
$transcript = $mostLikely->getTranscript();
$confidence = $mostLikely->getConfidence();
printf('Transcript: %s' . PHP_EOL, $transcript);
printf('Confidence: %s' . PHP_EOL, $confidence);
}
This code throws the following error:
Invalid audio channel count value: 0. Values must be non-negative.
And setting the audio channel like this:
$explicitConfig = new Google\Cloud\Speech\V2\ExplicitDecodingConfig([
'encoding' => Google\Cloud\Speech\V1\RecognitionConfig\AudioEncoding::ENCODING_UNSPECIFIED,
'sample_rate_hertz' => 16000,
'audio_channel_count' => 2,
]);
Throws this error:
The RecognitionConfig proto is invalid:
* explicit_decoding_config.audio_channel_count: audio_channel_count isn't supported by the set encoding
Thanks for your help.
Regards, Johann