Add Respeecher Space TTS service

Open Kharacternyk opened this issue 4 months ago • 6 comments

Our docs: https://space.respeecher.com/docs

We also support non-WebSocket endpoints, but it's not clear whether that is a common use case for Pipecat users, so we would appreciate some clarification on whether it's worth adding an HTTP version of the service.

Docs PR: https://github.com/pipecat-ai/docs/pull/355

Kharacternyk avatar Sep 12 '25 19:09 Kharacternyk

Hello! Sorry for the delayed response. We've just rolled out Community Integrations and we'd like to invite you to submit your integration for listing.

While your integration won't be part of Pipecat's core code, it will be discoverable in the Pipecat docs and fully usable with Pipecat.

Please review the guidelines here and let me know if you have any questions: https://github.com/pipecat-ai/pipecat/blob/main/COMMUNITY_INTEGRATIONS.md

markbackman avatar Oct 03 '25 21:10 markbackman

Hey @markbackman, this could work for us, but there's a problem: we support contexts but not word timestamps, so our ideal base class would be AudioContextTTSService, which Pipecat lacks. If we can't upstream AudioContextTTSService, we would need to maintain a lot of duplicated context-management code that could frequently get out of sync with the version in AudioContextWordTTSService.

Kharacternyk avatar Oct 06 '25 08:10 Kharacternyk

Hi @Kharacternyk, the AudioContextTTSService class handles the WebSocket case where multiple audio streams can be returned at the same time. This is unique to WebSocket services.
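To illustrate the "multiple streams at the same time" case for readers following along, here is a minimal sketch of demultiplexing interleaved audio chunks by context ID while still releasing them in context-creation order. `ContextAudioDemux` and all of its methods are hypothetical names made up for this illustration; this is not Pipecat's implementation or API.

```python
from collections import OrderedDict


class ContextAudioDemux:
    """Hypothetical helper (not a Pipecat class): buffers audio per context
    and releases it in the order the contexts were created."""

    def __init__(self):
        # context_id -> chunks received but not yet released
        self._buffers = OrderedDict()
        # contexts the server has marked as complete
        self._finished = set()

    def create_context(self, context_id):
        self._buffers.setdefault(context_id, [])

    def add_chunk(self, context_id, chunk):
        # Chunks for different contexts may arrive interleaved on one socket.
        self._buffers.setdefault(context_id, []).append(chunk)

    def finish_context(self, context_id):
        self._finished.add(context_id)

    def drain(self):
        """Return chunks that are safe to play now: everything from the oldest
        context, moving on to the next one only once the oldest is finished."""
        out = []
        while self._buffers:
            context_id, chunks = next(iter(self._buffers.items()))
            out.extend(chunks)
            chunks.clear()
            if context_id not in self._finished:
                break  # later contexts wait until this one completes
            del self._buffers[context_id]
            self._finished.discard(context_id)
        return out
```

With an HTTP API each request returns a single stream, so this kind of bookkeeping is not needed there.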

If your service has an HTTP API, you should manage the context internally in the service class. In fact, ElevenLabsHttpTTSService has a context feature that is managed inside the class, and it uses the TTSService base class. So I think this should be possible.

Please reach out with any specific questions that you may have.

markbackman avatar Oct 15 '25 15:10 markbackman

Hi @markbackman, thanks for the reply. Our service is a WebSocket service. The problem is that there's no AudioContextTTSService in Pipecat (this PR adds one); there's only AudioContext**Word**TTSService. It's not obvious why these two separate features (contexts and word timestamps) are coupled, and it's a problem for us because we have contexts but currently lack word timestamps in our WebSocket service. Would you consider merging at least AudioContextTTSService from this PR so that our Community Integration and other people's can use it?
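A rough sketch of the decoupling being asked for, with context bookkeeping in one class and word timestamps layered on separately. Every class and method name below is hypothetical and chosen only for illustration; this is not Pipecat's actual class hierarchy, nor the code in the PR.

```python
class AudioContextSketch:
    """Hypothetical base: context bookkeeping only, no word timestamps."""

    def __init__(self):
        self._contexts = {}  # context_id -> list of audio chunks

    def create_audio_context(self, context_id):
        self._contexts[context_id] = []

    def has_audio_context(self, context_id):
        return context_id in self._contexts

    def append_audio(self, context_id, chunk):
        self._contexts[context_id].append(chunk)

    def remove_audio_context(self, context_id):
        # Return the audio collected for this context and forget the context.
        return b"".join(self._contexts.pop(context_id))


class WordTimestampsSketch:
    """Hypothetical mixin: word timestamps only, independent of contexts."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.word_times = []  # list of (word, seconds) pairs

    def add_word_timestamps(self, pairs):
        self.word_times.extend(pairs)


class ContextAndWordSketch(WordTimestampsSketch, AudioContextSketch):
    """For services that provide both contexts and word timestamps."""


class ContextOnlySketch(AudioContextSketch):
    """For services that provide contexts but no word timestamps."""
```

In this layout a service without word timestamps inherits only the context logic, and the combined class reuses it rather than duplicating it.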

Kharacternyk avatar Oct 15 '25 17:10 Kharacternyk

Got it. Yes, the base classes for the TTS services are complex. There are so many permutations of how services are built.

Tagging @aconchillo for thoughts on how to proceed.

markbackman avatar Oct 15 '25 17:10 markbackman

Hey @aconchillo, have you had some time to take a look?

Kharacternyk avatar Nov 20 '25 18:11 Kharacternyk