agents Integration of Nova Sonic (AWS multimodal)

Is there a planned release for AWS Nova Sonic multimodal support?

https://docs.aws.amazon.com/nova/latest/userguide/speech.html

Apr 22 '25 20:04 kailashsp

Yes please! It would be awesome to be able to switch between the OpenAI Realtime API, Gemini Live API, and Amazon Nova Sonic realtime models. More context: https://aws.amazon.com/ai/generative-ai/nova/speech/

Apr 29 '25 13:04 timopetric

We are planning to support Nova as an S2S (speech-to-speech) model. The team has done some early work on the integration. It's very doable.

May 28 '25 00:05 davidzhao

that would be great

May 28 '25 09:05 dariusteep

@davidzhao Hi David, It looks like we are working on the same thing. I also have some working code that I can share. I would like to collaborate with you on this workstream. What would be the best way to contact you?

Can we get on a Zoom call tomorrow? Any time between 11 AM and 1 PM PST works for me. Based on your availability, I will send you a Zoom invite.

May 28 '25 19:05 BumaldaOverTheWater94

@BumaldaOverTheWater94 yes! let's chat. My email is ___. I can chat on Friday, but let's coordinate over email :)

May 29 '25 07:05 davidzhao

Hey @davidzhao! Just wanted to share an update: our team is also working on a LiveKit plugin for Nova Sonic. We've already built a version on an earlier release of LiveKit (LiveKit version 0.20.4 and livekit-agents version 0.12.17) that works seamlessly with Nova Sonic, and we're now developing a new version based on the latest LiveKit update. So far, we're able to use tools and have smooth conversations, and we're currently focusing on improving how interruptions are handled during conversations.

Jun 04 '25 04:06 riyageorge1

@riyageorge1, can you share what you've developed?

Jun 04 '25 14:06 Kruhlikau

@davidzhao Could you please provide an update on the release of this plugin? It would be greatly appreciated if a beta version could be released soon, allowing us to start working on LiveKit and Nova Sonic.

Jun 09 '25 08:06 12121vishnu

Any update on the release of the plugin?

Jun 24 '25 13:06 timopetric

We have done similar work using the AWS SDK for Swift and AVAudioEngine to implement smooth playback, interruption, echo cancellation, and other features for Nova Sonic voice. See the following open source code in the SwiftChat app:

https://github.com/aws-samples/swift-chat/blob/main/react-native/ios/Services/AudioManager.swift

Results as follows:

https://github.com/user-attachments/assets/ebf21b12-9c93-4d2e-a109-1d6484019838

I also look forward to this feature being implemented in LiveKit. Being able to implement it once across multiple platforms would simplify a lot of work.

Jun 26 '25 08:06 zhu-xiaowei

Hi all, thanks for your patience. The Nova Sonic plugin is now available as part of LiveKit Agents SDK v1.1.5. If you encounter any bugs, raise an issue and tag me.

Jun 30 '25 17:06 BumaldaOverTheWater94

Hi @BumaldaOverTheWater94, thanks for this integration

Could you help me with a minimal working example?

I'm trying to use the livekit.agents module as follows:

from dotenv import load_dotenv

from livekit import agents
from livekit.agents import Agent, AgentSession, RoomInputOptions
from livekit.plugins import aws, noise_cancellation

load_dotenv(override=True)

class NovaSonicAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a helpful voice assistant.",
        )

async def entrypoint(ctx: agents.JobContext):
    session = AgentSession(llm=aws.realtime.realtime_model.RealtimeModel())

    await session.start(
        room=ctx.room,
        agent=NovaSonicAgent(),
        room_input_options=RoomInputOptions(
            noise_cancellation=noise_cancellation.BVC(),
        ),
    )

    await ctx.connect()

if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))

However, when I try to run it, I get the following error:

AttributeError: 'IndexError' object has no attribute 'message'

Let me know if I'm missing something in the setup or if this might be a bug. I'd really appreciate any guidance.

These are my library versions:

    "livekit>=1.0.11",
    "livekit-agents[google,openai,tavus]>=1.1.5",
    "livekit-api>=1.0.3",
    "livekit-plugins-aws[realtime]>=1.1.5",
    "livekit-plugins-noise-cancellation~=0.2",

Edit:

The error originates at this call:

                # note: user ASR text is slightly different than what is sent to LiveKit (newline vs whitespace)  # noqa: E501
                # TODO: fix this
                self._update_chat_ctx(role="user", text_content=text_content)

Edit 2:

It works after adding the following code at the start of the _update_chat_ctx function (aws.experimental.realtime.realtime_model.py, RealtimeSession._update_chat_ctx):

        prev_utterance = self._chat_ctx.items[-1] if self._chat_ctx.items else None

        if not prev_utterance or not prev_utterance.content:
            # no previous utterance, so just add the new one
            self._chat_ctx.add_message(role=role, content=text_content)
            if len(self._chat_ctx.items) > MAX_MESSAGES:
                self._chat_ctx.truncate(max_items=MAX_MESSAGES)
            return
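
For anyone wondering why the guard helps, here is a minimal standalone sketch of the same idea. It uses a plain Python list as a stand-in for self._chat_ctx.items. The Message class, the update_chat_ctx function, the merge-on-same-role behaviour, and the MAX_MESSAGES value are all hypothetical illustrations, not the plugin's real code. The point is only that items[-1] on an empty list raises IndexError, so the first utterance has to be handled separately.

```python
from dataclasses import dataclass

MAX_MESSAGES = 40  # hypothetical cap; the plugin defines its own value


@dataclass
class Message:
    role: str
    content: str


def update_chat_ctx(items: list[Message], role: str, text: str) -> None:
    """Append or merge an utterance without indexing into an empty list."""
    # Guard: items[-1] would raise IndexError when the list is empty.
    prev = items[-1] if items else None

    if prev is None or not prev.content:
        # No usable previous utterance, so just add the new one.
        items.append(Message(role=role, content=text))
    elif prev.role == role:
        # Assumed behaviour: the same speaker continues, so merge the text.
        prev.content = f"{prev.content} {text}"
    else:
        items.append(Message(role=role, content=text))

    # Keep only the most recent MAX_MESSAGES entries (no-op when shorter).
    del items[:-MAX_MESSAGES]
```

Calling update_chat_ctx([], "user", "hello") on an empty list takes the guarded branch instead of failing on the index lookup.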

Jul 01 '25 19:07 MatheusRDG

Hi @MatheusRDG, thanks for trying out the plugin. This is a bug that I noticed right after the v1.1.5 release. If you look at the latest code (https://github.com/livekit/agents/blob/main/livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/experimental/realtime/realtime_model.py#L664) and run from there, that should handle the issue.

Use something like uv pip install -e . to link the source code as a dependency to your working project.

Jul 01 '25 23:07 BumaldaOverTheWater94

One other option is to add some messages to the chat_ctx up front, so the context is never empty and you don't hit the index out-of-bounds error:

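from livekit.agents import Agent, ChatContext, llm

# Placeholder assistant reply used to seed the context; substitute your own text.
story = "Once upon a time, ..."

class Assistant(Agent):
    def __init__(self, tools: list[llm.FunctionTool | llm.RawFunctionTool]) -> None:
        # Seed the chat context so it is non-empty before the first user utterance.
        chat_ctx = ChatContext.empty()
        chat_ctx.add_message(role="user", content="hey sonic, tell me a children's story")
        chat_ctx.add_message(role="assistant", content=story)

        super().__init__(
            instructions="You are a helpful voice AI assistant.",
            tools=tools,
            chat_ctx=chat_ctx,
        )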

Jul 01 '25 23:07 BumaldaOverTheWater94

@BumaldaOverTheWater94 Both options are working for me. Really appreciate your work, thank you!

Jul 01 '25 23:07 MatheusRDG

Does Nova-sonic support function tooling? Does it support RunContext?

Jul 06 '25 19:07 mridulrao

Yes, function tool calling is supported. See https://github.com/livekit/agents/pull/2817 for an example.

Unfortunately RunContext is not currently supported as the RealtimeSession does not have knowledge of the outside AgentSession to inject RunContext. Adding support for RunContext is WIP.

Jul 08 '25 19:07 BumaldaOverTheWater94

@mridulrao RunContext support has been added with v1.1.16

Jul 10 '25 18:07 BumaldaOverTheWater94

marking this issue as completed

Jul 22 '25 17:07 BumaldaOverTheWater94