Add LMNT TTS plugin

Open zachoverflow opened this issue 1 year ago • 3 comments

I have added support for both the low-latency streaming and non-streaming versions of the LMNT API, following the existing TTS structure in adjacent plugins.

One note:

  • I have included a dependency on torchaudio. While it is a bit heavyweight, I noticed that livekit-plugins-openai also takes a dependency on it so I have done the same. (It makes implementation slightly easier.)

While I'm here, I'm also fixing a minor README issue to help future users run the kitt demo:

  • The click configuration inside run_app attaches the LiveKit args to cli() instead of the subcommand start(). This means click expects us to put the subcommand 'start' after the LiveKit args instead of before them.
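
To illustrate the ordering behaviour described above, here is a minimal sketch using click's test runner. The group, the `--url` option name, and the URL value are illustrative assumptions; this is not the actual `run_app` code.

```python
import click
from click.testing import CliRunner


# Hypothetical stand-in for run_app: the option is attached to the group
# (cli), not to the subcommand (start).
@click.group()
@click.option("--url", default=None, help="Illustrative option name")
@click.pass_context
def cli(ctx, url):
    ctx.obj = {"url": url}


@cli.command()
@click.pass_context
def start(ctx):
    click.echo(f"starting with url={ctx.obj['url']}")


runner = CliRunner()

# Group options first, then the subcommand: accepted.
ok = runner.invoke(cli, ["--url", "ws://localhost:7880", "start"])

# Subcommand first: `start` does not define --url, so click rejects it.
bad = runner.invoke(cli, ["start", "--url", "ws://localhost:7880"])

print(ok.exit_code, ok.output.strip())
print(bad.exit_code)
```

Moving the options onto `start` itself would allow the more familiar `start --url ...` order; the README fix only documents the order the current configuration accepts.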

zachoverflow, Jan 26 '24 22:01

CLA assistant check
All committers have signed the CLA.

CLAassistant, Jan 26 '24 22:01

I went ahead and fixed the sample rate used when creating the audio frame to 24 kHz, since that's what we always send, so there's no need for an internal workaround on our side. Thanks!

zachoverflow, Jan 30 '24 06:01

Hi @zachoverflow, I have a question about the TTS API. I tried using this code to build a plugin for personal use, but when I run it, the request fails with: LMNT API failed: 400 {"error": "Missing `text` or `voice` required arguments in request."} Code snippet:

...
url = f"{LMNT_BASE_URL}/ai/speech"
headers = {"X-API-Key": self.api_key}
body = {
    "text": message.text,
    "voice": self.voice_id,
}

...
async with self.session.post(url, headers=headers, json=body) as resp:
    if resp.status != 200:
        logger.error(f"LMNT API failed: {resp.status} {await resp.text()}")
        raise Exception(f"LMNT API returned {resp.status} status code")

    data = await resp.json()
    audio = base64.b64decode(data["audio"])
    chunk_queue.put_nowait(audio)
...

Any idea why this is happening? I'd appreciate any help.

parshvadaftari, Aug 06 '24 09:08
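
A possible way to narrow down the 400 above, offered as a sketch rather than a confirmed diagnosis: the error says the server did not see `text` or `voice`, which could mean one of the values was empty or `None`, or that the endpoint does not read JSON bodies and expects form-encoded fields instead. Neither cause is confirmed in this thread. The helper name `build_speech_payload` and the sample values below are hypothetical.

```python
import json
from urllib.parse import urlencode


def build_speech_payload(text, voice_id):
    """Check both required fields before sending; return the payload dict."""
    missing = [
        name
        for name, value in (("text", text), ("voice", voice_id))
        if not isinstance(value, str) or not value.strip()
    ]
    if missing:
        raise ValueError(f"missing or empty fields: {', '.join(missing)}")
    return {"text": text, "voice": voice_id}


# Placeholder values; substitute message.text and self.voice_id.
payload = build_speech_payload("Hello there", "example-voice")

# Body produced by aiohttp's `json=payload` (Content-Type: application/json).
json_body = json.dumps(payload)

# Body produced by aiohttp's `data=payload` (form-encoded). Assumption:
# the endpoint may only parse this form, which is worth testing.
form_body = urlencode(payload)

print(json_body)
print(form_body)
```

If validation passes and the JSON request still returns 400, swapping `json=body` for `data=body` in the `session.post` call is a cheap experiment to tell the two hypotheses apart.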