[Very Important for LiveKit] Request: add a general mechanism to customize the plugins of VoiceAssistant ASAP
LiveKit brings very good RTC to the world, both open source and in the cloud. Awesome! But LiveKit Agents has one big problem:
The VoiceAssistant pipeline is hardcoded as the combination VAD+STT+LLM+TTS, which makes it hard to customize and causes a lot of problems for anyone who wants to add or remove a plugin. The following cases all fail:
If I only need VAD+STT+LLM (TTS removed), VoiceAssistant may crash.
If I remove the VAD plugin (but keep the others), VoiceAssistant may crash.
If I organize it as just VAD + Multimodal, VoiceAssistant may crash.
If I want extra processing after TTS, there is no way to insert a plugin at the end of the pipeline.
If I want to customize chat_ctx dynamically, it is complex and requires a lot of code changes.
The related hardcoded construction is here:

    assistant = VoiceAssistant(
        vad=ctx.proc.userdata["vad"],
        stt=deepgram.STT(),
        llm=openai.LLM(),
        tts=openai.TTS(),
        chat_ctx=initial_ctx,
    )
Result: right now the Agents framework is good for demos but not for production, because every customer has very specific demands and needs a general, easy way to customize the flow.
Expected: VoiceAssistant should be a general pipeline framework that only manages the data flow (text, voice) between plugins and connects the plugins to complete a task. It should not depend on the type or purpose of a plugin, or on how a plugin works internally.
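To make the expectation concrete, here is a minimal sketch of what such a generic pipeline could look like. Everything in it is hypothetical and for illustration only: `Stage`, `Pipeline`, `FakeSTT`, `FakeLLM`, `Shout` and `run_pipeline` are made-up names, not part of the LiveKit Agents API, and the toy stages only stand in for real plugins. The point is that the pipeline just chains streams and never checks what kind of plugin each stage is.

```python
import asyncio
from typing import Any, AsyncIterator, Iterable, List, Protocol


class Stage(Protocol):
    """Hypothetical plugin interface: take a stream of items, return a stream of items."""

    def process(self, items: AsyncIterator[Any]) -> AsyncIterator[Any]: ...


class Pipeline:
    """Chains any number of stages; knows nothing about STT, LLM, TTS, etc."""

    def __init__(self, *stages: Stage) -> None:
        self.stages = list(stages)

    def run(self, source: AsyncIterator[Any]) -> AsyncIterator[Any]:
        stream = source
        for stage in self.stages:
            # The output stream of one stage is the input stream of the next.
            stream = stage.process(stream)
        return stream


class FakeSTT:
    """Toy stand-in for speech-to-text: decodes bytes into text."""

    async def process(self, items: AsyncIterator[bytes]) -> AsyncIterator[str]:
        async for frame in items:
            yield frame.decode("utf-8")


class FakeLLM:
    """Toy stand-in for an LLM: echoes the text back."""

    async def process(self, items: AsyncIterator[str]) -> AsyncIterator[str]:
        async for text in items:
            yield f"echo: {text}"


class Shout:
    """Toy post-processing stage appended at the end of the pipeline."""

    async def process(self, items: AsyncIterator[str]) -> AsyncIterator[str]:
        async for text in items:
            yield text.upper()


async def _from_iterable(values: Iterable[Any]) -> AsyncIterator[Any]:
    for value in values:
        yield value


def run_pipeline(pipeline: Pipeline, inputs: Iterable[Any]) -> List[Any]:
    """Feed the inputs through the pipeline and collect every output."""

    async def _collect() -> List[Any]:
        return [item async for item in pipeline.run(_from_iterable(inputs))]

    return asyncio.run(_collect())


if __name__ == "__main__":
    # No TTS stage, plus an extra stage at the end: only the stage list changes.
    print(run_pipeline(Pipeline(FakeSTT(), FakeLLM(), Shout()), [b"hello"]))
```

With a design along these lines, removing TTS, dropping VAD, or appending a post-processing step would only change the list of stages passed to `Pipeline`, not the framework code.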
BTW: the latest version seems better than before, since it splits VoiceAssistant into a Pipeline concept, but the stages are still hardcoded inside.