feat: Implement ElevenLabs Text-to-Speech

Open apappascs opened this issue 8 months ago • 2 comments

This commit introduces support for the ElevenLabs Text-to-Speech (TTS) service within the Spring AI framework.

Key Changes:

  • New Model Module: Added spring-ai-elevenlabs module for ElevenLabs integration.
  • Core Classes:
    • ElevenLabsTextToSpeechModel: Implements TextToSpeechModel and StreamingTextToSpeechModel for interacting with the ElevenLabs API.
    • ElevenLabsTextToSpeechOptions: Configuration options for the ElevenLabs TTS service.
    • ElevenLabsApi: Low-level client for interacting with the ElevenLabs API.
    • ElevenLabsVoicesApi: Client for the ElevenLabs Voices API.
    • Speech, TextToSpeechMessage, TextToSpeechPrompt, TextToSpeechResponse: Data transfer objects.
  • Auto-configuration:
    • ElevenLabsAutoConfiguration: Spring Boot auto-configuration for easy setup.
    • ElevenLabsConnectionProperties: Configuration properties for ElevenLabs connection.
    • ElevenLabsSpeechProperties: Configuration properties for default TTS settings.
  • API Clients: Provides ElevenLabsApi for direct interaction with the ElevenLabs API, and ElevenLabsVoicesApi for the Voices API.
  • Tests: Includes comprehensive unit and integration tests.
  • Documentation: Added documentation to the Spring AI reference guide, including examples.

Functionality:

  • Text-to-Speech Conversion: Allows users to convert text input into audio using ElevenLabs' high-quality voices.
  • Streaming Support: Supports real-time audio streaming, enabling immediate playback as audio is generated.
  • Configurable Options: Provides flexible configuration options for voice selection, output format, speed, stability, and more.
  • Spring Boot Starter: Includes a Spring Boot starter (spring-ai-elevenlabs-spring-boot-starter) for simplified dependency management and auto-configuration.
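
The difference between one-shot conversion and streaming, as listed above, can be pictured with a small self-contained sketch. Everything in it is a hypothetical stand-in written for illustration: the BlockingTts and StreamingTts interfaces and the FakeTtsModel class are assumptions, not the signatures of ElevenLabsTextToSpeechModel, and the "audio" is simply the UTF-8 bytes of the input text rather than output from the ElevenLabs API.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class TtsSketch {

    /** Hypothetical stand-in for a blocking contract: the full audio is returned at once. */
    public interface BlockingTts {
        byte[] call(String text);
    }

    /** Hypothetical stand-in for a streaming contract: chunks are pushed as they become available. */
    public interface StreamingTts {
        void stream(String text, Consumer<byte[]> onChunk);
    }

    /** Fake model (assumption): "audio" is the UTF-8 bytes of the text, emitted in fixed-size chunks. */
    public static final class FakeTtsModel implements BlockingTts, StreamingTts {
        private final int chunkSize;

        public FakeTtsModel(int chunkSize) {
            if (chunkSize <= 0) {
                throw new IllegalArgumentException("chunkSize must be positive");
            }
            this.chunkSize = chunkSize;
        }

        @Override
        public byte[] call(String text) {
            return text.getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public void stream(String text, Consumer<byte[]> onChunk) {
            byte[] audio = call(text);
            // Hand each slice to the consumer so playback could begin before the end is reached.
            for (int start = 0; start < audio.length; start += chunkSize) {
                int end = Math.min(start + chunkSize, audio.length);
                onChunk.accept(Arrays.copyOfRange(audio, start, end));
            }
        }
    }

    /** Collects streamed chunks into a list so a caller can inspect them. */
    public static List<byte[]> collectChunks(StreamingTts model, String text) {
        List<byte[]> chunks = new ArrayList<>();
        model.stream(text, chunks::add);
        return chunks;
    }

    public static void main(String[] args) {
        FakeTtsModel model = new FakeTtsModel(4);
        byte[] whole = model.call("Hello, world");
        List<byte[]> chunks = collectChunks(model, "Hello, world");
        System.out.println("blocking bytes: " + whole.length);   // 12
        System.out.println("streamed chunks: " + chunks.size()); // 3
    }
}
```

With the blocking style the caller waits for the whole byte array; with the streaming style each chunk can be forwarded to a player or an HTTP response as soon as it arrives. The actual module may expose streaming through a different type than the callback used here.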

Notes:

  • The classes defined in the tts package will be moved to the core package, along with any refactoring required to support the OpenAI speech API.

Related Issue #2371

apappascs avatar Mar 02 '25 20:03 apappascs

resolves https://github.com/spring-projects/spring-ai/issues/2371

apappascs avatar Mar 03 '25 11:03 apappascs

Hi @markpollack, I'm following up to see whether there's any visibility on the review timeline for this ElevenLabs PR, as bandwidth allows. It would be awesome to have it.

apappascs avatar Apr 18 '25 10:04 apappascs

Now that GA is past us, we can get back to this. Classes such as TextToSpeechModel should now go in the API package, but at first glance this looks great. I will test drive it, but feel free to start adapting it to the current package/module structure so it can be merged.

markpollack avatar Jun 06 '25 13:06 markpollack

I've updated the docs a bit and added a couple of tests. What a great PR! Thanks!

I've also added support in the spring-ai-integration-tests repo to run these ITs.

Committed in 9398850c2b54807fb7ba951c61ac559f376005d6

Closing now. If there is something you want to change, just re-open and we can discuss, @apappascs.

markpollack avatar Jun 11 '25 19:06 markpollack