pipecat feat: add Vonage Audio Connector integration (serializer, transport, foundational example)

Summary

This PR introduces the Vonage Audio Connector integration: a custom serializer, the VonageAudioConnectorTransport and VonageAudioConnectorOutputTransport, and a foundational example.

Changes

Added foundational example: examples/foundational/49-vonage-audio-connector-openai.py
Added VonageFrameSerializer under src/pipecat/serializers/vonage.py
Added VonageAudioConnectorTransport and VonageAudioConnectorOutputTransport under src/pipecat/transports/vonage/audio_connector.py
Added new package folder src/pipecat/transports/vonage/ with __init__.py
Updated env.example
Updated pyproject.toml and uv.lock

Why This Is Needed

This integration enables Pipecat to work with the Vonage Voice API Audio Connector, supporting real-time STT → LLM → TTS pipelines. It will also expand the ecosystem of community-maintained integrations.

Testing

Basic end-to-end pipeline validated (audio in → STT → LLM → TTS → audio out)
Serializer and transport tested for encoding/decoding correctness
Verified pacing behavior (sleep-per-chunk timing) matches Vonage Audio Connector requirements
Confirmed WAV-header wrapping when enabled
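
For reviewers who want to picture the two behaviors checked above, here is a minimal sketch of sleep-per-chunk pacing and WAV-header wrapping. It is not the code in this PR: the function names are hypothetical, the 16 kHz / 16-bit mono format is an assumption, and the 20 ms chunk size is the figure mentioned later in this thread.

```python
import asyncio
import io
import wave

# Assumed audio format for illustration only: 16 kHz, 16-bit, mono linear PCM.
SAMPLE_RATE = 16000
SAMPLE_WIDTH = 2  # bytes per sample
CHANNELS = 1
CHUNK_MS = 20  # chunk duration discussed in this thread


def chunk_pcm(pcm: bytes, sample_rate: int = SAMPLE_RATE, chunk_ms: int = CHUNK_MS) -> list[bytes]:
    """Split raw PCM into fixed-duration chunks (the last one may be shorter)."""
    bytes_per_chunk = sample_rate * SAMPLE_WIDTH * CHANNELS * chunk_ms // 1000
    return [pcm[i:i + bytes_per_chunk] for i in range(0, len(pcm), bytes_per_chunk)]


def wrap_wav(pcm: bytes, sample_rate: int = SAMPLE_RATE) -> bytes:
    """Prepend a standard WAV header to raw PCM using the stdlib wave module."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(SAMPLE_WIDTH)
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)
    return buf.getvalue()


async def send_paced(pcm: bytes, send, chunk_ms: int = CHUNK_MS) -> None:
    """Send one chunk, then sleep for the chunk's duration before the next."""
    for chunk in chunk_pcm(pcm, chunk_ms=chunk_ms):
        await send(chunk)  # `send` is any async callable, e.g. a websocket send
        await asyncio.sleep(chunk_ms / 1000)
```

A plain sleep per chunk drifts slightly over long utterances because send time is not subtracted; a production sender would more likely schedule against a monotonic clock.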

Nov 21 '25 12:11 varunps2003

Hi @jamsea I’ve created the PR for the Vonage Audio Connector integration (serializer, transport, foundational example).
Please take a look whenever you get a chance — happy to make any changes needed. Thanks!

Nov 21 '25 12:11 varunps2003

I’ve just pushed a follow-up commit to switch the foundational example from the dev OpenTok API URL to the production https://api.opentok.com.

Nov 21 '25 13:11 varunps2003

Hi @markbackman and @filipi87, could you please find some time to review this PR?

Dec 02 '25 02:12 varunps2003

Sorry for the delay. We're backlogged on PR reviews. I took a quick look at this and think it's a good plan to split it up. First, can you create a PR for only the VonageFrameSerializer? Along with this, it would be helpful to submit an example for pipecat-examples showing how to dial-in and dial-out. This would be similar to the examples that exist for Twilio, Telnyx, Plivo, and Exotel.

That's a big enough change to add and test that I think we should start there. It will also help developers get started right away as they can easily test and run the example. WDYT?

The VonageFrameSerializer should be written to work with the FastAPIWebsocketTransport. Is there a reason to add a new websocket transport to work specifically with the VonageFrameSerializer?

Dec 05 '25 04:12 markbackman

Hi @markbackman, thank you for the initial review comments. Here are the reasons for keeping the transport and foundational example together with the VonageFrameSerializer:

Regarding splitting the PR: in this case the VonageFrameSerializer cannot be meaningfully reviewed or tested on its own. It requires the accompanying Vonage-specific WebSocket transport and the foundational example. All three pieces form a single atomic unit:
a) The serializer and transport are tightly coupled because the Vonage Audio Connector expects specific binary framing, sequencing, and pacing.
b) Without the transport, the serializer cannot be executed.
c) Without the example, there is no runnable validation for reviewers.
If you check out this branch, everything works end-to-end with the current serializer + transport + example. Splitting them would make the serializer untestable in isolation and make the PR harder to validate.
On the dial-in/dial-out point: Vonage's workflow differs from Twilio/Telnyx/Plivo/Exotel, so the foundational example here is the correct equivalent for Vonage. It demonstrates the Audio Connector flow as the intended usage pattern.
Regarding FastAPIWebsocketTransport: the Vonage Audio Connector requires low-level binary frame control (opcodes, sequence numbers, 20 ms chunk pacing), which the existing transport doesn't expose. The custom transport keeps this logic isolated without modifying core transports.

Happy to iterate further, but keeping these three components together ensures the reviewer can run and validate the integration immediately.

Additionally, today I created two PRs in the pipecat-examples repository:

https://github.com/pipecat-ai/pipecat-examples/pull/129
https://github.com/pipecat-ai/pipecat-examples/pull/130

These examples require the vonage-audio-connector dependency, which this PR adds to pyproject.toml in the main Pipecat repository.

Dec 11 '25 12:12 varunps2003

Hi @markbackman and @filipi87

I’ve rebased the feature branch onto the latest main to resolve conflicts and verify the changes against the current Pipecat codebase. I also renumbered the foundational example from 49-* to 50-*, since 49 was already in use.

To try it out, install the optional dependencies and run it the same way as other foundational examples:

uv run examples/foundational/50-vonage-audio-connector-openai.py

Please ensure the required OpenAI and Vonage environment variables are set (via .env). If running locally, you can use:

ngrok http 8005

to obtain the wss URL and set it in the Vonage-related environment variables.

Thanks for taking a look!

Dec 16 '25 15:12 varunps2003

Sorry for the delay on this review. It's been a busy week!

I kept thinking about your proposal and really wanted to avoid adding a new transport. Instead, I spent a little bit of time looking at how to implement this within the existing FastAPIWebsocketTransport constraints. Check out this PR: https://github.com/pipecat-ai/pipecat/pull/3265

It adds a new mode for handling text and binary messages to the FastAPIWebsocketTransport. It also adds a new VonageFrameSerializer.
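
To make the text-versus-binary distinction concrete, here is a rough sketch of how incoming WebSocket messages might be routed. It is not the implementation in PR #3265. The function name and return shape are hypothetical, and it assumes audio arrives as binary frames while control events arrive as JSON text.

```python
import json
from typing import Union


def classify_message(message: Union[str, bytes, bytearray]) -> tuple[str, object]:
    """Route a raw WebSocket message by its type (hypothetical helper).

    Binary frames are treated as audio; text frames are parsed as JSON events.
    """
    if isinstance(message, (bytes, bytearray)):
        return "audio", bytes(message)
    try:
        return "event", json.loads(message)
    except json.JSONDecodeError:
        # Text that is not valid JSON is passed through for the caller to log or drop.
        return "unknown", message
```

A serializer built on this split would decode the "audio" branch into audio frames and map the "event" branch onto control frames, without needing a transport of its own.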

I'd propose this: let's work on PR #3265 and get the core of this work implemented. I see you have more features for the serializer in your PR. Once 3265 is merged, you can follow up with a PR to add auto hangup and any other desired features to the serializer. Does that make sense?

Also, we don't need the foundational example. We do need a pipecat-example for this. In building this out myself, I wrote the inbound example: https://github.com/pipecat-ai/pipecat-examples/pull/133

I'd love feedback on it. Also, we'll need an outbound example, which I'm happy to have you contribute.

How does this all sound?

Dec 19 '25 03:12 markbackman