Simplismart Integration in Livekit

Open Tushar-ml opened this issue 2 months ago • 6 comments

This is an integration PR for Simplismart's open-source STT, TTS, and LLM models.

Tushar-ml avatar Dec 21 '25 14:12 Tushar-ml

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Dec 21 '25 14:12 CLAassistant

Hello and thanks for the PR! How can we test it? I checked your web app and can't find any TTS models or endpoints to test with. I've had success with public big Whisper-v3, and it works pretty fast 💨

Hormold avatar Dec 22 '25 23:12 Hormold

Sure @Hormold, we are releasing it today and will let you know.

Tushar-ml avatar Dec 23 '25 04:12 Tushar-ml

Hey @Hormold, you can use this code snippet to test the integration. As @Tushar-ml said, LiveKit-compatible TTS models will also be released on the Simplismart platform today, if you'd prefer that.

Happy to help if you have any more questions.

Example code snippet:

from livekit.agents import (
    Agent,
    AgentSession,
    JobContext,
    RunContext,
    WorkerOptions,
    cli,
    function_tool,
)
from livekit.plugins import silero
from livekit.plugins import simplismart, openai
from dotenv import load_dotenv
import os

load_dotenv()

SIMPLISMART_API_KEY = os.getenv("SIMPLISMART_API_KEY")

@function_tool
async def lookup_weather(
    context: RunContext,
    location: str,
):
    """Used to look up weather information."""

    return {"weather": "sunny", "temperature": 70}


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    agent = Agent(
        instructions="You are a friendly voice assistant built by LiveKit.",
    )
    session = AgentSession(
        vad=silero.VAD.load(),
        # any combination of STT, LLM, TTS, or realtime API can be used
        stt=simplismart.STT(
            base_url="https://api.simplismart.live/predict",
            api_key=SIMPLISMART_API_KEY,
            model="openai/whisper-large-v3-turbo",
        ),
        llm=openai.LLM(
            model="google/gemma-3-4b-it",
            api_key=SIMPLISMART_API_KEY,
            base_url="https://api.simplismart.live",
        ),
        tts=simplismart.TTS(
            base_url="https://api.simplismart.live/tts",
            api_key=SIMPLISMART_API_KEY,
            model="Simplismart/orpheus-3b-0.1-ft",
        ),
    )

    await session.start(agent=agent, room=ctx.room)


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))

simplipratik avatar Dec 23 '25 12:12 simplipratik

I tested it, and the voice sounds pretty smooth! The LLM isn't working well for me, though: I tried Gemma (7b and 28b) and Llama (70b), and while they do respond, they are very, very slow.

Hormold avatar Dec 23 '25 22:12 Hormold

@Hormold Thanks for testing this out, glad to hear you liked the voice quality!

For the LLM latency, I’d recommend trying a smaller model like Gemma 3 4B (google/gemma-3-4b-it). Larger models can have a higher time to first token (TTFT), which is likely what you’re seeing. I’ll also dig deeper on our side to see if there’s anything we can optimize further.
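
One way to check whether the slowness comes from the first token or from overall generation is to time the first streamed chunk separately from the full response. This is a minimal sketch, assuming the LLM output is available as an async iterator of text chunks; `measure_ttft` and `fake_stream` are hypothetical names used for illustration and are not part of LiveKit or Simplismart.

```python
import asyncio
import time
from typing import AsyncIterator, Tuple


async def measure_ttft(stream: AsyncIterator[str]) -> Tuple[float, float, str]:
    """Return (ttft_seconds, total_seconds, full_text) for a stream of chunks."""
    start = time.perf_counter()
    ttft = None
    parts = []
    async for chunk in stream:
        if ttft is None:
            # Elapsed time until the very first chunk arrives.
            ttft = time.perf_counter() - start
        parts.append(chunk)
    total = time.perf_counter() - start
    if ttft is None:
        ttft = total  # the stream produced no chunks at all
    return ttft, total, "".join(parts)


async def fake_stream(first_delay: float, gap: float) -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming LLM response."""
    await asyncio.sleep(first_delay)
    for word in ("hello", " ", "world"):
        yield word
        await asyncio.sleep(gap)


if __name__ == "__main__":
    ttft, total, text = asyncio.run(measure_ttft(fake_stream(0.05, 0.01)))
    print(f"TTFT={ttft:.3f}s total={total:.3f}s text={text!r}")
```

If TTFT dominates the total time, a smaller model is the likelier fix. If the gap between chunks dominates, throughput is the bottleneck.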

In the meantime, if the STT and TTS components look good, could we go ahead and merge this PR?

Merry Christmas in advance 🎄

simplipratik avatar Dec 24 '25 10:12 simplipratik

Happy New Year folks! 🎉

Just wanted to gently check in on this PR and see if there’s any update from your side @Hormold

Happy to dive in and iterate if there’s anything that can help speed things up (additional tests, docs, or tweaks).

dakshisdakshs avatar Jan 02 '26 07:01 dakshisdakshs

Hey @Hormold,

I've fixed the formatting issue. Let me know if anything else is needed!

simplipratik avatar Jan 07 '26 09:01 simplipratik

@coderabbitai review

tinalenguyen avatar Jan 20 '26 19:01 tinalenguyen

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai[bot] avatar Jan 20 '26 19:01 coderabbitai[bot]

📝 Walkthrough

Adds a new Simplismart LiveKit plugin providing Speech-to-Text (HTTP + WebSocket streaming) and Text-to-Speech (HTTP streaming) integrations, along with package metadata, versioning, logging, and docs; includes Pydantic option models and comprehensive error handling.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Plugin infra: `livekit-plugins-simplismart/livekit/plugins/simplismart/__init__.py`, `livekit-plugins-simplismart/livekit/plugins/simplismart/log.py`, `livekit-plugins-simplismart/livekit/plugins/simplismart/version.py` | Adds SimplismartPlugin and registers it at import, module logger, and version = "1.3.9". |
| Packaging / workspace: `livekit-plugins-simplismart/pyproject.toml`, `pyproject.toml` | New project pyproject for the plugin, dependencies, build config; registers plugin in workspace sources. |
| Type defs: `livekit-plugins-simplismart/livekit/plugins/simplismart/models.py` | Adds Literal type aliases TTSModels and STTModels for supported models. |
| STT implementation: `livekit-plugins-simplismart/livekit/plugins/simplismart/stt.py` | Implements SimplismartSTTOptions (Pydantic), non-streaming HTTP recognition, streaming WebSocket SpeechStream, session management, chunking, transcript handling, and mapped API error types. |
| TTS implementation: `livekit-plugins-simplismart/livekit/plugins/simplismart/tts.py` | Implements SimplismartTTSOptions, TTS class with synthesize() returning ChunkedStream, HTTP streaming of PCM audio, and API error translation. |
| Documentation: `livekit-plugins-simplismart/README.md` | New README describing features, install, and environment setup (SIMPLISMART_API_KEY). |
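
For readers unfamiliar with the "Literal type aliases" pattern mentioned for models.py, a rough sketch follows. The two model identifiers are the ones used in the example snippet earlier in this thread; the plugin's actual aliases may list more or different entries, and `is_known_model` is a hypothetical helper, not something the PR is known to contain.

```python
from typing import Literal, get_args

# Identifiers copied from the example snippet in this thread; the real
# aliases in models.py may differ (assumption).
STTModels = Literal["openai/whisper-large-v3-turbo"]
TTSModels = Literal["Simplismart/orpheus-3b-0.1-ft"]


def is_known_model(name: str, alias) -> bool:
    """Check a model name against a Literal alias at runtime (hypothetical helper)."""
    return name in get_args(alias)


if __name__ == "__main__":
    print(is_known_model("openai/whisper-large-v3-turbo", STTModels))
    print(is_known_model("some/other-model", TTSModels))
```

Literal aliases give editors and type checkers a list of suggested model names without blocking arbitrary strings at runtime, which is presumably why plugins pair them with a plain `str` fallback.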

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant STT as STT Instance
    participant SpeechStream
    participant WebSocket as WS Conn
    participant SimplismartAPI as Simplismart API
    participant Emitter as Speech Event Emitter

    Client->>STT: stream(language, conn_options)
    STT->>SpeechStream: create SpeechStream
    Client->>SpeechStream: push_frame(audio)
    SpeechStream->>WebSocket: _connect_ws()
    WebSocket->>SimplismartAPI: open WS (Auth header)
    WebSocket-->>SpeechStream: connection established
    SpeechStream->>WebSocket: send_initial_config (model, language, VAD)
    loop audio loop
      Client->>SpeechStream: push_frame(chunk)
      SpeechStream->>WebSocket: send chunk (base64)
      SimplismartAPI-->>WebSocket: transcript message
      WebSocket->>SpeechStream: on_message
      SpeechStream->>Emitter: emit SpeechEvent (FINAL_TRANSCRIPT / partial)
    end
    Client->>SpeechStream: close
    WebSocket->>SimplismartAPI: close WS

sequenceDiagram
    participant Client
    participant TTS as TTS Instance
    participant ChunkedStream
    participant HTTP as HTTP Client
    participant SimplismartAPI as Simplismart API
    participant Emitter as Audio Emitter

    Client->>TTS: synthesize(text, conn_options)
    TTS->>ChunkedStream: create ChunkedStream
    Client->>ChunkedStream: run(output_emitter)
    ChunkedStream->>HTTP: POST /tts (Auth, payload)
    HTTP->>SimplismartAPI: send request
    loop stream audio
      SimplismartAPI-->>HTTP: audio chunk (PCM)
      HTTP->>Emitter: push(audio chunk)
    end
    HTTP->>Emitter: flush() / close
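
The "send_initial_config" and "send chunk (base64)" steps in the STT diagram can be sketched as follows. This is an illustration only: the JSON field names (`type`, `model`, `language`, `data`) and the 3200-byte chunk size (100 ms of 16 kHz, 16-bit mono PCM) are assumptions, not the Simplismart wire format, and no network connection is opened.

```python
import base64
import json
from typing import Iterator, List


def iter_chunks(pcm: bytes, chunk_bytes: int) -> Iterator[bytes]:
    """Split raw PCM bytes into fixed-size chunks; the last one may be shorter."""
    for offset in range(0, len(pcm), chunk_bytes):
        yield pcm[offset : offset + chunk_bytes]


def build_messages(
    pcm: bytes, model: str, language: str, chunk_bytes: int = 3200
) -> List[str]:
    """Build the text frames a client would send: one config, then audio chunks.

    The message schema here is hypothetical.
    """
    messages = [json.dumps({"type": "config", "model": model, "language": language})]
    for chunk in iter_chunks(pcm, chunk_bytes):
        payload = base64.b64encode(chunk).decode("ascii")
        messages.append(json.dumps({"type": "audio", "data": payload}))
    return messages


if __name__ == "__main__":
    silence = bytes(8000)  # 250 ms of 16 kHz, 16-bit mono silence
    frames = build_messages(silence, "openai/whisper-large-v3-turbo", "en")
    print(len(frames), "frames; first:", frames[0])
```

Base64 inflates the payload by about a third, so a binary WebSocket frame would be cheaper if the server accepts one; the diagram indicates base64, so the sketch follows that.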

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • tinalenguyen

Poem

🐰 I hopped into code with a joyful heart,
STT and TTS now ready to start.
Chunks and webs, tokens in flight,
SimpliSmart sings through day and night.
Happy hopping—transcribe and delight!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 26.92% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Simplismart Integration in Livekit' accurately describes the main objective of the pull request, which adds comprehensive Simplismart STT, TTS, and LLM support to the LiveKit agents framework. |
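
To find which definitions are dragging the docstring coverage down, a rough local check can be written with the standard `ast` module. This sketch counts the module plus every function and class; CodeRabbit's exact counting rules are not stated in this thread, so the percentage may not match its figure.

```python
import ast


def docstring_coverage(source: str) -> float:
    """Percentage of the module, classes, and functions that have a docstring."""
    tree = ast.parse(source)
    definitions = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    # The module itself counts as one documentable node.
    nodes = [tree] + [n for n in ast.walk(tree) if isinstance(n, definitions)]
    documented = sum(1 for n in nodes if ast.get_docstring(n) is not None)
    return 100.0 * documented / len(nodes)


SAMPLE = '''
def documented():
    """Has a docstring."""

def undocumented():
    pass

class Thing:
    pass
'''

if __name__ == "__main__":
    # One documented node out of four (module, two functions, one class).
    print(f"{docstring_coverage(SAMPLE):.2f}%")  # prints 25.00%
```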

coderabbitai[bot] avatar Jan 20 '26 19:01 coderabbitai[bot]