
STS roadmap

Open · Blaizzy opened this issue 11 months ago · 0 comments

The roadmap covers both approaches you mentioned:

  • End-to-End Speech-to-Speech Models: A direct approach using dedicated STS architectures like Moshi.
  • Modular Voice Pipeline: A composable approach combining Speech-to-Text, LLM processing, and Text-to-Speech.
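
As a rough illustration of the modular approach, here is a minimal sketch of how the three stages could be composed. Everything in it is hypothetical: `VoicePipeline`, the stage type aliases, and the stub functions are made up for this example and are not part of the mlx-audio API. The stubs stand in for real Speech-to-Text, LLM, and Text-to-Speech models so the snippet runs on its own.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures (assumptions for illustration, not mlx-audio APIs).
SpeechToText = Callable[[bytes], str]   # audio in  -> transcript
LanguageModel = Callable[[str], str]    # transcript -> reply text
TextToSpeech = Callable[[str], bytes]   # reply text -> audio out


@dataclass
class VoicePipeline:
    """Chains three independent stages; any stage can be swapped out."""
    stt: SpeechToText
    llm: LanguageModel
    tts: TextToSpeech

    def respond(self, audio_in: bytes) -> bytes:
        transcript = self.stt(audio_in)   # Speech-to-Text
        reply = self.llm(transcript)      # LLM processing
        return self.tts(reply)            # Text-to-Speech


# Stub stages: they treat the "audio" as UTF-8 text so the example is runnable
# without any model weights.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"echo: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(stt=fake_stt, llm=fake_llm, tts=fake_tts)
audio_out = pipeline.respond(b"hello")
```

Because each stage is just a callable, a real model can replace a stub without touching the other two. An end-to-end model like Moshi would instead collapse all three stages into a single audio-to-audio call.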

Blaizzy · Mar 27 '25 23:03