livekit-plugins-elevenlabs TTS stalls when selecting a custom voice
I'm having a really hard time getting ElevenLabs TTS to work with a dedicated/cloned voice. I'm using the voice-pipeline-agent-python example, and it works with the standard openai.TTS voice.
elevenlabs.TTS with the default voice, configured with only the API key and model, also works:
assistant = VoicePipelineAgent(
    vad=ctx.proc.userdata["vad"],
    stt=openai.STT(),
    llm=openai.LLM(model="Meta-Llama-3.1-8B-Instruct", api_key="xxxx"),
    tts=elevenlabs.TTS(model_id="eleven_turbo_v2_5", api_key=os.getenv("ELEVEN_API_KEY")),
    chat_ctx=initial_ctx,
)
But when I try to use a custom cloned voice by passing voice=Voice(id="....", ...), the client never starts audio playback (it keeps waiting for the ElevenLabs audio) and eventually times out. There don't seem to be any error messages.
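One way to narrow this down is to check the voice ID against the ElevenLabs REST API directly, outside the LiveKit pipeline. The following is a minimal sketch, not part of the plugin: it assumes the public `GET /v1/voices/{voice_id}` endpoint with the `xi-api-key` header, and the helper names `build_voice_request` and `fetch_voice` are made up for illustration. If the lookup fails here, the problem is likely with the voice ID or the API key rather than with the agent.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Assumption: the public ElevenLabs REST endpoint for fetching a single voice.
ELEVENLABS_VOICE_URL = "https://api.elevenlabs.io/v1/voices/{voice_id}"


def build_voice_request(voice_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one voice (hypothetical helper)."""
    url = ELEVENLABS_VOICE_URL.format(voice_id=urllib.parse.quote(voice_id, safe=""))
    return urllib.request.Request(
        url,
        headers={"xi-api-key": api_key, "Accept": "application/json"},
        method="GET",
    )


def fetch_voice(voice_id: str, api_key: str, timeout: float = 10.0) -> dict:
    """Send the lookup and return the decoded JSON body, or raise with the HTTP status."""
    request = build_voice_request(voice_id, api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    except urllib.error.HTTPError as exc:
        raise RuntimeError(f"voice lookup failed: HTTP {exc.code}") from exc


if __name__ == "__main__":
    import os

    # "xxx" is the same placeholder voice ID used in this post.
    voice = fetch_voice("xxx", os.environ["ELEVEN_API_KEY"])
    # The field names below are assumptions about the response shape.
    print(voice.get("name"), voice.get("category"))
```

Running it as a script with ELEVEN_API_KEY set either prints the voice metadata or raises with the HTTP status code, which is more than the silent timeout gives.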
Help much appreciated!
Full code:
import logging
import os

from dotenv import load_dotenv
from livekit.agents import (
    AutoSubscribe,
    JobContext,
    JobProcess,
    WorkerOptions,
    cli,
    llm,
)
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import openai, deepgram, silero, elevenlabs
from livekit.plugins.elevenlabs import TTS, Voice, VoiceSettings

load_dotenv(dotenv_path=".env.local")
logger = logging.getLogger("voice-agent")


def prewarm(proc: JobProcess):
    proc.userdata["vad"] = silero.VAD.load()


async def entrypoint(ctx: JobContext):
    initial_ctx = llm.ChatContext().append(
        role="system",
        text=(
            """You are a helpful assistant """
        ),
    )

    logger.info(f"connecting to room {ctx.room.name}")
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

    # Wait for the first participant to connect
    participant = await ctx.wait_for_participant()
    logger.info(f"starting voice assistant for participant {participant.identity}")

    assistant = VoicePipelineAgent(
        vad=ctx.proc.userdata["vad"],
        stt=openai.STT(),
        llm=openai.LLM(model="Meta-Llama-3.1-8B-Instruct", api_key="xxxx"),
        tts=elevenlabs.TTS(
            voice=Voice(id="xxx", settings=VoiceSettings(stability=0.71, similarity_boost=0.5)),
            model="eleven_turbo_v2_5",
            api_key=os.getenv("ELEVEN_API_KEY"),
        ),
        chat_ctx=initial_ctx,
    )

    assistant.start(ctx.room, participant)

    # The agent should be polite and greet the user when it joins :)
    await assistant.say("hi i am an assistant..", allow_interruptions=True)


if __name__ == "__main__":
    cli.run_app(
        WorkerOptions(
            entrypoint_fnc=entrypoint,
            prewarm_fnc=prewarm,
        ),
    )
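Since nothing shows up in the logs, it may also help to raise the log level for the LiveKit loggers so that any swallowed TTS or connection errors become visible. This is only a sketch using the standard `logging` module: the logger names `livekit.agents` and `livekit.plugins.elevenlabs` are assumptions about what the installed packages use, `enable_debug_logging` is a hypothetical helper, and the agents CLI may apply its own logging configuration on top.

```python
import logging

# Assumption: these logger names match the ones used by the installed packages.
LIVEKIT_LOGGERS = ("livekit.agents", "livekit.plugins.elevenlabs")


def enable_debug_logging(names=LIVEKIT_LOGGERS) -> None:
    """Set the named loggers to DEBUG and ensure a stderr handler exists."""
    # basicConfig is a no-op if the root logger already has handlers.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    for name in names:
        logging.getLogger(name).setLevel(logging.DEBUG)
```

Calling `enable_debug_logging()` at the start of `entrypoint` would be one place to try it.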