agents

Build real-time multimodal AI applications 🤖🎙️📹

Results 238 agents issues

Changes: "**This class** completely wraps ..." is referring to the class in the singular. That means we need to make these changes: 1. “abstract away” → “abstracts away” (to match...

Trying to wrap my head around the multimodal agent and the OpenAI Realtime API :) I want to steer the conversation by managing the system context and my intuition was that I...

When testing, I notice that sometimes the agent doesn't receive the task even when the load is under the threshold. I added logging and found that the websocket didn't...

When the agent is speaking with `allow_interruption=False`, we should not be processing any user input, instead of queuing up another response (only to play it out later). That response will...

The LLM client is configured with a 5 second read timeout. If the client times out (which it does very often with a short timeout), the stream is not resumed....

I've implemented a button in the client that is supposed to ensure VAD (Voice Activity Detection) doesn't immediately commit my conversation and send it to the server. Instead, it should...

Is there sample code, or can you explain how to pass additional context to the LLM with the new OpenAI multimodal example, as in this pipeline agent example? https://github.com/livekit/agents/blob/main/examples/voice-pipeline-agent/simple-rag/assistant.py

LiveKit brings very good RTC to the world, whether open source or cloud. Awesome! But LiveKit Agents has one big problem: the LiveKit `VoiceAssistant` pipeline is hardcoded as a combination of VAD+STT+LLM+TTS, which...

I think a common use case is to toggle between voice and text mode (like in the ChatGPT app among others). If the goal is to create a multimodal framework...