Feature Request: Ultravox example

Open thiswillbeyourgithub opened this issue 11 months ago • 9 comments

Hi,

Ultravox is a suite of open-weight models designed to get the time to first token as low as possible with audio input. Basically, they trained a good, fast projector that maps the output of the Whisper large-v3 encoder into the embedding space of Llama 3.1 LLMs, in both 8B and 70B sizes.
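To make the projector idea concrete, here is a rough, purely illustrative sketch in plain Python. It is not Ultravox's actual code: the toy dimensions, the frame-stacking step, and the names `stack_frames`, `linear`, and `project` are all assumptions made for the example. The only point is that a small learned mapping turns audio-encoder frames into vectors of the same width as the LLM's token embeddings, so the LLM can consume audio directly without a separate speech-to-text step.

```python
import random


def stack_frames(frames, k):
    """Concatenate every k consecutive frames to shorten the sequence.

    Frame stacking is an assumption of this sketch; leftover frames
    that do not fill a full group of k are dropped.
    """
    usable = len(frames) - len(frames) % k
    return [sum(frames[i:i + k], []) for i in range(0, usable, k)]


def linear(x, weight, bias):
    """Compute y = W x + b for a single input vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def project(frames, weight, bias, k):
    """Map audio-encoder frames to LLM-sized embedding vectors."""
    return [linear(x, weight, bias) for x in stack_frames(frames, k)]


# Toy sizes, chosen only so the example runs instantly (hypothetical values).
AUDIO_DIM, LLM_DIM, STACK = 4, 6, 2

rng = random.Random(0)
# Random weights stand in for the trained projector parameters.
weight = [[rng.uniform(-1, 1) for _ in range(AUDIO_DIM * STACK)]
          for _ in range(LLM_DIM)]
bias = [0.0] * LLM_DIM

# Ten fake encoder frames stand in for the audio encoder's output.
frames = [[rng.uniform(-1, 1) for _ in range(AUDIO_DIM)] for _ in range(10)]

audio_embeddings = project(frames, weight, bias, STACK)
print(len(audio_embeddings), len(audio_embeddings[0]))  # prints "5 6"
```

In a real system the resulting vectors would be placed alongside the text token embeddings before the LLM runs, which is why there is no transcription step adding to the time to first token.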

I think it would be a great fit for LiveKit Agents, so it would be nice to add an example and demo for it!

thiswillbeyourgithub avatar Jan 15 '25 22:01 thiswillbeyourgithub

Thanks -- happy to work with folks at Livekit to make this happen!

zkoch avatar Jan 25 '25 21:01 zkoch

Any update on this?

therealron avatar Mar 02 '25 17:03 therealron

Hello Everyone, any update on this?

psinojiya avatar Mar 19 '25 12:03 psinojiya

I'm still very interested, especially since nothing as simple as Ollama exists for audio+text chat, but I lack the skills to implement it. I'm surprised no one has created it yet, since it seems to be in the best interest of all parties involved: LiveKit, Ultravox, Kyutai (they made Moshi), etc. And everyone seems to advertise their solution as easy to implement.

thiswillbeyourgithub avatar Mar 19 '25 12:03 thiswillbeyourgithub

Hi, it will be very easy to implement with our next big release.

jayeshp19 avatar Mar 19 '25 14:03 jayeshp19

Sounds great @jayeshp19, any approximate idea on dates for that?

psinojiya avatar Mar 19 '25 14:03 psinojiya

In the next week or two. You can track progress here: https://github.com/livekit/agents/pull/1364

jayeshp19 avatar Mar 19 '25 14:03 jayeshp19

@jayeshp19 Any current updates on this? I am not seeing anything related to this in the link provided. Thanks!

mdwoicke avatar Apr 03 '25 01:04 mdwoicke

I'm looking to use Ultravox with LiveKit and would love to know where this project stands.

NelsonHotel avatar May 09 '25 05:05 NelsonHotel

Is there still no working example for the community?

akarray avatar May 25 '25 10:05 akarray

I am working on a PR for this: https://github.com/livekit/agents/pull/2409

[!WARNING] This covers only Ultravox's paid API service, not the open-weight model. If you want to use the model, you will have to host and manage it yourself. There are many similar model-related issues here (#2262, #1724, #962, #1687), but unfortunately I think that's out of the current scope.

ChenghaoMou avatar May 27 '25 11:05 ChenghaoMou

Any update on this? I would really like Ultravox to work with LiveKit.

nischalj10 avatar Jun 28 '25 09:06 nischalj10

Is there any update on LiveKit support for Ultravox?

thevijaydeore avatar Jul 05 '25 13:07 thevijaydeore

+1

tongclement avatar Jul 22 '25 16:07 tongclement

hey @jayeshp19 - just following up on this

nischalj10 avatar Jul 22 '25 17:07 nischalj10

It would be great if it supported fine-tuned versions of the model too, for example ones fine-tuned on custom data to support new languages like Cantonese. (I'm still investigating how this would work on a technical level.)

tongclement avatar Jul 23 '25 07:07 tongclement

Any update?

mercuryyy avatar Aug 14 '25 07:08 mercuryyy

Any update?

attiquers avatar Aug 31 '25 01:08 attiquers