python-sdks
How do I play a pre-recorded message in the entrypoint?
Based on the examples, it's normal to have a greeting at the end of the entrypoint. Something like:
await agent.say("Welcome, I'm a friendly assistant...", allow_interruptions=True)
This message is the same every time, and regenerating it ($$$) on every call just burns tokens for no good reason. How do I play a pre-recorded message from the agent here?
I think I can reverse-engineer VoicePipelineAgent.say(...) and inject a WAV there, but I'm curious if there's an easier way.
thanks claude - this is ugly but it works.
import asyncio
import wave

import numpy as np
from livekit.rtc import AudioFrame, AudioSource, LocalAudioTrack


async def play_greeting_file(local_participant, wav_path: str = "greeting.wav"):
    # Read WAV file (assumed to be 16-bit PCM, since samples are read as int16 below)
    with wave.open(wav_path, 'rb') as wav_file:
        # Get wav file properties
        sample_rate = wav_file.getframerate()
        num_channels = wav_file.getnchannels()
        sample_width = wav_file.getsampwidth()
        print(f"Audio properties: rate={sample_rate}, channels={num_channels}, width={sample_width}")

        # Create audio source with matching parameters
        audio_source = AudioSource(
            sample_rate=sample_rate,
            num_channels=num_channels,
            queue_size_ms=5000  # 5 second buffer
        )

        # Create and publish track
        track = LocalAudioTrack.create_audio_track("greeting", audio_source)
        await local_participant.publish_track(track)

        # Add a small delay to ensure everything is ready
        await asyncio.sleep(0.5)

        # Read and send audio data
        chunk_size = sample_rate // 10  # 100ms chunks
        while True:
            raw_data = wav_file.readframes(chunk_size)
            if not raw_data:
                break
            # Just pass the raw PCM data; numpy is only used to count the samples
            samples = np.frombuffer(raw_data, dtype=np.int16)
            frame = AudioFrame(
                data=raw_data,
                sample_rate=sample_rate,
                num_channels=num_channels,
                samples_per_channel=len(samples) // num_channels
            )
            await audio_source.capture_frame(frame)

    # Wait for audio to finish playing
    await audio_source.wait_for_playout()

    # Cleanup
    await audio_source.aclose()
    await local_participant.unpublish_track(track.sid)

await play_greeting_file(ctx.room.local_participant)
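For what it's worth, the WAV-reading part can probably be pulled out into a small stdlib-only helper, so the file is read once and the chunks can be reused on every call. A rough sketch, not tested against LiveKit itself: load_wav_chunks and its chunk_ms parameter are names made up for this example, and it assumes 16-bit PCM input.

```python
import wave


def load_wav_chunks(wav_path: str, chunk_ms: int = 100):
    """Read a 16-bit PCM WAV file and split it into fixed-duration chunks.

    Hypothetical helper (not part of any SDK). Returns
    (sample_rate, num_channels, chunks), where each chunk is a
    (raw_bytes, samples_per_channel) tuple.
    """
    with wave.open(wav_path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        num_channels = wav_file.getnchannels()
        sample_width = wav_file.getsampwidth()
        if sample_width != 2:
            # Assumption: downstream code treats the data as int16 samples
            raise ValueError(f"expected 16-bit PCM, got {sample_width * 8}-bit")

        # One WAV "frame" holds one sample for every channel
        frames_per_chunk = sample_rate * chunk_ms // 1000
        bytes_per_frame = sample_width * num_channels

        chunks = []
        while True:
            raw = wav_file.readframes(frames_per_chunk)
            if not raw:
                break
            # The last chunk may be shorter than chunk_ms
            chunks.append((raw, len(raw) // bytes_per_frame))

    return sample_rate, num_channels, chunks
```

Each (raw, samples_per_channel) pair lines up with the data and samples_per_channel arguments passed to AudioFrame in the snippet above, so the numpy import is no longer needed just to count samples.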
In the upcoming 1.0 agents release, you will be able to play audio in say(...).
You can see the code in the agents repo under the dev-1.0 branch.
That’s awesome!