[Feat]: Google TTS streaming with async client
I saw this example using Google TTS streaming with Chirp3. I'm trying to do something similar with the async client but am running into issues: only the first sentence from the iterator plays.
Here’s a minimal repro:
import asyncio

from google.cloud import texttospeech


async def process_streaming_synthesis():
    client = texttospeech.TextToSpeechAsyncClient()
    streaming_config = texttospeech.StreamingSynthesizeConfig(
        voice=texttospeech.VoiceSelectionParams(
            name="en-US-Chirp3-HD-Charon",
            language_code="en-US",
        ),
        streaming_audio_config=texttospeech.StreamingAudioConfig(
            audio_encoding=texttospeech.AudioEncoding.OGG_OPUS
        ),
    )
    config_request = texttospeech.StreamingSynthesizeRequest(streaming_config=streaming_config)
    text_iterator = [
        "Hello there.",
        "How are you today?",
        "It's such nice weather outside.",
    ]

    async def request_generator():
        yield config_request
        for text in text_iterator:
            await asyncio.sleep(0)  # yield control to event loop
            yield texttospeech.StreamingSynthesizeRequest(
                input=texttospeech.StreamingSynthesisInput(text=text)
            )

    with open("output.ogg", "wb") as audio_file:
        stream = await client.streaming_synthesize(request_generator())
        async for response in stream:
            audio_file.write(response.audio_content)
    print("Complete audio saved to output.ogg")


def main():
    asyncio.run(process_streaming_synthesis())


if __name__ == "__main__":
    main()
Would really appreciate any pointers or a working example. Thanks!
PTAL: @inardini @holtskinner
Bumping this @holtskinner @inardini
@holtskinner @inardini - I just spoke to Google engineering internally, and it seems this fix will cut latency from ~800 ms to ~200 ms. Could you please help identify who can expedite a resolution?
@jayeshp19 - According to engineering, this is a known issue internally when an audio_encoding other than PCM is used in TTS streaming.
Could you please replace OGG_OPUS with PCM?
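For reference, here is a minimal sketch of the PCM variant of the repro. It is an untested sketch, not a confirmed fix. Assumptions: the installed google-cloud-texttospeech version exposes `AudioEncoding.PCM`, the stream returns headerless 16-bit mono PCM, and the sample rate is 24 kHz (please check this against your voice). `write_pcm_to_wav` and `synthesize_pcm` are hypothetical helper names, not part of the library.

```python
import wave
from typing import Iterable, List


def write_pcm_to_wav(chunks: Iterable[bytes], path: str, sample_rate: int = 24000) -> int:
    """Wrap raw 16-bit mono PCM chunks in a WAV container; return PCM bytes written.

    The 24000 Hz default is an assumption, not a value confirmed in this thread.
    """
    total = 0
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono (assumed)
        wav_file.setsampwidth(2)   # 16-bit samples (assumed)
        wav_file.setframerate(sample_rate)
        for chunk in chunks:
            wav_file.writeframes(chunk)
            total += len(chunk)
    return total


async def synthesize_pcm(texts: List[str]) -> List[bytes]:
    """Stream the given sentences and collect the PCM chunks from every response."""
    # Imported lazily so the WAV helper above works without the client library.
    from google.cloud import texttospeech

    client = texttospeech.TextToSpeechAsyncClient()
    streaming_config = texttospeech.StreamingSynthesizeConfig(
        voice=texttospeech.VoiceSelectionParams(
            name="en-US-Chirp3-HD-Charon",
            language_code="en-US",
        ),
        streaming_audio_config=texttospeech.StreamingAudioConfig(
            # PCM instead of OGG_OPUS, as suggested above.
            audio_encoding=texttospeech.AudioEncoding.PCM,
        ),
    )

    async def request_generator():
        # The first request carries only the config; later ones carry text.
        yield texttospeech.StreamingSynthesizeRequest(streaming_config=streaming_config)
        for text in texts:
            yield texttospeech.StreamingSynthesizeRequest(
                input=texttospeech.StreamingSynthesisInput(text=text)
            )

    stream = await client.streaming_synthesize(requests=request_generator())
    return [response.audio_content async for response in stream]
```

Usage would be something like `chunks = asyncio.run(synthesize_pcm(["Hello there.", "How are you today?"]))` followed by `write_pcm_to_wav(chunks, "output.wav")`. For real-time playback you would hand each chunk to the audio sink as it arrives instead of collecting them all first.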
@manishkjs1 confirmed this has been fixed. Feel free to close the issue now.
Thanks @davidzhao, but I think there is a little confusion here.
This issue is rather about introducing {StreamingSynthesizeRequest} capability in the tts.py module of the Google plugin, and I think that is still pending. Will @jayeshp19 continue to work on it?
Closing this one as Google supports streaming with PCM encoding. Thanks @manishkjs1