Resemble's Websocket TTS integration with Pipecat

Open krishvadhani19 opened this issue 8 months ago • 12 comments

Add Resemble AI TTS Integration

This PR adds a new ResembleTTSService class that integrates Resemble AI's streaming TTS API with Pipecat.

Key Features

  • Real-time streaming of TTS audio using Resemble AI's WebSocket API
  • Support for:
    • Custom voice selection via UUID
    • Configurable sample rate (default 48kHz)
    • PCM 16-bit audio format
  • Proper frame handling including:
    • TTSStartedFrame
    • TTSAudioRawFrame
    • TTSStoppedFrame
    • ErrorFrame for error cases
  • Metrics tracking (TTFB and usage metrics)
  • Clean connection handling and resource cleanup
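As a rough illustration of the frame ordering listed above, here is a minimal sketch of an async generator that emits a started frame, one audio frame per chunk, an error frame on failure, and a stopped frame at the end. The `Started`, `Audio`, `Stopped`, and `Error` dataclasses are hypothetical stand-ins, not the real Pipecat frame classes, and emitting a stopped frame after an error is an assumed design choice rather than something this PR confirms.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, List


# Stand-ins for illustration only; these are NOT the real Pipecat frame classes.
@dataclass
class Started:
    pass


@dataclass
class Audio:
    audio: bytes
    sample_rate: int


@dataclass
class Stopped:
    pass


@dataclass
class Error:
    error: str


async def run_tts_sketch(
    chunks: List[bytes], sample_rate: int
) -> AsyncGenerator[object, None]:
    """Yield frames in the order: Started, Audio..., optional Error, Stopped."""
    yield Started()
    try:
        for chunk in chunks:
            if not chunk:
                # Treat an empty chunk as a simulated API error.
                raise ValueError("empty audio chunk")
            yield Audio(chunk, sample_rate)
    except ValueError as exc:
        yield Error(str(exc))
    # Assumption: the stopped frame is emitted even after an error.
    yield Stopped()


async def collect_frame_names(chunks: List[bytes], sample_rate: int) -> List[str]:
    """Drain the generator and return the class name of each frame."""
    return [type(frame).__name__ async for frame in run_tts_sketch(chunks, sample_rate)]
```

Calling `asyncio.run(collect_frame_names([...], sample_rate))` with a list of fake PCM chunks shows the ordering without needing a network connection.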

Implementation Details

  • Uses websockets library for WebSocket communication
  • Handles base64-encoded audio content from API
  • Includes proper error handling for:
    • Connection issues
    • API errors
    • Unexpected disconnects
  • Follows Pipecat's service pattern with async generators
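The base64 handling mentioned above could look roughly like the sketch below. It parses one JSON text message and returns the decoded PCM bytes. The `"type"` and `"audio_content"` field names are assumptions made for illustration; check the Resemble streaming docs for the actual message schema.

```python
import base64
import json
from typing import Optional


def decode_audio_message(raw_message: str) -> Optional[bytes]:
    """Return raw PCM bytes from one JSON message, or None if it carries no audio.

    The "type" and "audio_content" field names are assumed, not confirmed.
    """
    message = json.loads(raw_message)
    if message.get("type") != "audio":
        return None
    return base64.b64decode(message["audio_content"])


def make_fake_audio_message(pcm: bytes) -> str:
    """Build a fake message in the assumed shape, for local testing only."""
    payload = base64.b64encode(pcm).decode("ascii")
    return json.dumps({"type": "audio", "audio_content": payload})
```

Round-tripping a few bytes through `make_fake_audio_message` and `decode_audio_message` is a quick way to exercise the decoding path without a Business plan key.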

Usage Example

tts = ResembleTTSService(
    api_key="your_api_key",
    voice_uuid="your_voice_uuid",
    sample_rate=48000  # optional
)

krishvadhani19 avatar May 10 '25 04:05 krishvadhani19

Docs: https://docs.app.resemble.ai/docs/text_to_speech/streaming_websocket

@markbackman this is ready for approval!

krishvadhani19 avatar May 14 '25 03:05 krishvadhani19

Hey @krishvadhani19 I just set up an example with this and unfortunately, it doesn't run. Does this work for you?

Also, comparing against the docs, I see the code uses a 48 kHz sample rate, but that's not a supported sample rate.

UPDATE: Hmm, it looks like the WebSocket service requires a $699/mo Business plan. Maybe that's my issue.

markbackman avatar May 16 '25 16:05 markbackman

For us to accept a submission, it would have to mimic the other TTS services. A good one to look at would be CartesiaTTSService. The only difference between Resemble.ai and Cartesia is that it appears that Resemble doesn't support word/timestamp pairs.

It might support context_ids via the request_id feature, which would be required for interruptions to work properly, so that should be implemented too. If you're interested in working on that, it would be great.
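To make the interruption point concrete, here is a minimal sketch of how a service might use a per-request ID to drop audio that arrives after an interruption. The `RequestTracker` class is hypothetical, and it assumes the API echoes the request ID back on every audio message, which this thread does not confirm.

```python
import uuid
from typing import Optional


class RequestTracker:
    """Track the single active request so stale audio can be discarded."""

    def __init__(self) -> None:
        self._active_id: Optional[str] = None

    def start_request(self) -> str:
        """Create a fresh ID for a new synthesis request and mark it active."""
        self._active_id = str(uuid.uuid4())
        return self._active_id

    def interrupt(self) -> None:
        """Forget the active request; audio still in flight becomes stale."""
        self._active_id = None

    def accepts(self, request_id: str) -> bool:
        """Return True only for audio that belongs to the active request."""
        return self._active_id is not None and request_id == self._active_id
```

The receive loop would call `accepts(...)` on each incoming message and skip any audio whose ID no longer matches, so a user interruption does not leak old speech into the next turn.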

Also, you should:

  • Create an example similar to 07-interruptible.py for testing.
  • Add @traced_tts to the run_tts method to enable tracing.
  • Add any optional dependencies to pyproject.toml under the key resemble. It looks like websockets is needed.
  • Update dot-env.template with the required keys.
  • Add a CHANGELOG entry

markbackman avatar May 16 '25 16:05 markbackman

@krishvadhani19 can you reply? Otherwise, I'll close this out due to inactivity.

markbackman avatar Jun 02 '25 15:06 markbackman

hi @markbackman I will make changes to the PR. Apologies for the delay.

krishvadhani19 avatar Jun 03 '25 15:06 krishvadhani19

Hi @markbackman, I made all of the changes mentioned above. Ready for approval!

krishvadhani19 avatar Jun 19 '25 17:06 krishvadhani19

@krishvadhani19 does this require a Business Plan ($699/mo) to test?

From the docs, I see:

Note: Websocket API is only available for Business plan users. If you're running into trouble, upgrade to a Business plan or higher on the billing page.

markbackman avatar Jun 21 '25 14:06 markbackman

@krishvadhani19 I just removed your message as it contained a key. You might want to rotate it. If you want to share a key, that would be great, but it might be better done via a Discord DM. You can find me on the Pipecat Discord as MarkAtDaily.

markbackman avatar Jun 21 '25 22:06 markbackman

@krishvadhani19 Sorry for so many comments! There's a lot to take into account for building a TTS service.

One more question: do you have an idle timeout (e.g. the websocket disconnects after N seconds of no input)? If so, you'll need to implement a keepalive function.
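A keepalive of this kind can be sketched as a background task that calls a send function at a fixed interval until it is cancelled. The `send` callable and the interval are placeholders; what Resemble actually expects as a keepalive payload (a ping, an empty message, or something else) is not stated in this thread.

```python
import asyncio
from typing import Awaitable, Callable


async def keepalive(send: Callable[[], Awaitable[None]], interval_s: float) -> None:
    """Call `send` every `interval_s` seconds until the task is cancelled."""
    while True:
        await asyncio.sleep(interval_s)
        await send()


async def demo_keepalive() -> int:
    """Run the keepalive briefly against a fake sender and count the calls."""
    sent = 0

    async def fake_send() -> None:
        nonlocal sent
        sent += 1

    task = asyncio.create_task(keepalive(fake_send, interval_s=0.01))
    await asyncio.sleep(0.1)
    # Cancel on disconnect so the loop does not outlive the connection.
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return sent
```

In a real service, the task would be created after connecting and cancelled during disconnect, with `send` wrapping whatever message the provider treats as activity.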

I haven't tried to run the code yet, but after you clean up these comments, I'll give it a go! I'll provide more timely feedback next time :)

markbackman avatar Jul 25 '25 00:07 markbackman

This has been stale for a while - can we expect any updates soon? @krishvadhani19

haayhappen avatar Sep 05 '25 12:09 haayhappen

Hi team, apologies for the delay. I got stuck with a few releases and urgent demos.

Thank you for the patience. I will get it all fixed by this week.

krishvadhani19 avatar Sep 05 '25 16:09 krishvadhani19

Very interesting! Planning to launch this week?

mmkontis avatar Sep 07 '25 23:09 mmkontis