agents
Azure config param throws error.
Hi, thank you for this nice tool.
I am using Azure TTS with a custom prosody config, but I get an error from the library packages. How can I resolve this?
My current code snippet:
from livekit.plugins.azure.tts import ProsodyConfig

config = ProsodyConfig(rate="fast")  # this is the config file we can set for Azure

async def entrypoint(ctx: JobContext):
    initial_ctx = llm.ChatContext().append(
        role="system",
        text=prompt,
    )
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)
    # Wait for the first participant to connect
    participant = await ctx.wait_for_participant()
    assistant = VoicePipelineAgent(
        vad=ctx.proc.userdata["vad"],
        stt=deepgram.STT(language="ko"),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=azure.TTS(
            voice="ko-KR-JiMinNeural",
            language="ko-KR",
            prosody=config,
        ),
        chat_ctx=initial_ctx,
    )
Below is the Azure TTS class, where the config can be passed:
class TTS(tts.TTS):
    def __init__(
        self,
        *,
        speech_key: str | None = None,
        speech_region: str | None = None,
        voice: str | None = None,
        endpoint_id: str | None = None,
        language: str | None = None,
        prosody: ProsodyConfig | None = None,
    ) -> None:
When I do so, a ValueError is raised:
raise ValueError(
ValueError: failed to synthesize audio: ResultReason.Canceled: CancellationReason.Error (Connection was closed by the remote host. Error code: 1007. Error details: Ssml should contain at least one [VOICE] tag. USP state: TurnStarted. Received audio size: 0 bytes.)
I would appreciate any help with this.
This happens because the SSML generated by the Azure TTS plugin contains no <voice> tag.
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/tts.py
This can be resolved by changing the _synthesize function, around line 180 of that file:
def _synthesize() -> speechsdk.SpeechSynthesisResult:
    if self._opts.prosody:
        ssml = f'<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="{self._opts.language or "en-US"}">'
        voice_ssml = f'<voice name="{self._opts.voice}">'
        prosody_ssml = "<prosody"
        if self._opts.prosody.rate:
            prosody_ssml += f' rate="{self._opts.prosody.rate}"'
        if self._opts.prosody.volume:
            prosody_ssml += f' volume="{self._opts.prosody.volume}"'
        if self._opts.prosody.pitch:
            prosody_ssml += f' pitch="{self._opts.prosody.pitch}"'
        prosody_ssml += ">"
        ssml += voice_ssml
        ssml += prosody_ssml
        ssml += self._text
        ssml += "</prosody></voice></speak>"
        return synthesizer.speak_ssml_async(ssml).get()  # type: ignore
    return synthesizer.speak_text_async(self._text).get()  # type: ignore
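One thing the snippet above does not handle is text containing characters such as & or <, which would produce invalid SSML. The following is only a standalone sketch of the same idea (wrapping the text in <speak>, <voice> and <prosody>), not the plugin's actual code. The Prosody dataclass and the build_ssml function are hypothetical names used for illustration; Prosody merely stands in for the plugin's ProsodyConfig.

```python
from dataclasses import dataclass
from typing import Optional
from xml.sax.saxutils import escape, quoteattr


@dataclass
class Prosody:
    """Hypothetical stand-in for the plugin's ProsodyConfig."""
    rate: Optional[str] = None
    volume: Optional[str] = None
    pitch: Optional[str] = None


def build_ssml(
    text: str,
    voice: str,
    language: str = "en-US",
    prosody: Optional[Prosody] = None,
) -> str:
    """Wrap text in <speak><voice>[<prosody>] so a <voice> tag is always present."""
    # Escape &, < and > so the spoken text cannot break the XML.
    body = escape(text)
    if prosody is not None:
        # Emit only the prosody attributes that were actually set.
        fields = (
            ("rate", prosody.rate),
            ("volume", prosody.volume),
            ("pitch", prosody.pitch),
        )
        attrs = "".join(
            f" {name}={quoteattr(str(value))}" for name, value in fields if value
        )
        body = f"<prosody{attrs}>{body}</prosody>"
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(language)}>"
        f"<voice name={quoteattr(voice)}>{body}</voice></speak>"
    )
```

For example, build_ssml("Tom & Jerry", "ko-KR-JiMinNeural", "ko-KR", Prosody(rate="fast")) yields a document whose <voice> element wraps <prosody rate="fast">Tom &amp; Jerry</prosody>.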
Hey, it seems this was fixed in this PR: https://github.com/livekit/agents/pull/929
Unfortunately, it did not solve the problem. I still cannot pass the config. What might I be doing wrong here? Thank you.
Now that the change in https://github.com/livekit/agents/pull/929 has been merged, why don't you pull the project source code again and run it?
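A merged change only takes effect locally once the installed package actually contains it, so it can help to check what is installed first. This is a minimal sketch, assuming the distribution is named livekit-plugins-azure (inferred from the directory path mentioned earlier in this thread, not confirmed here). The installed_version helper is a hypothetical name, and this thread does not say which release includes the fix.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumed distribution name; adjust it if your environment differs.
    print(installed_version("livekit-plugins-azure"))
```

If the printed version predates the merge, the fix is probably not in your environment yet, and running from a fresh checkout of the source is the way to pick it up.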
thank you @harmlessman