Experimental speech streaming for LMNT (useChat/useCompletion React)

Open lgrammel opened this issue 5 months ago • 14 comments

Summary

Adds speech streaming to useChat and useCompletion with streamData.

  • useCompletion & useChat (for React) provide an experimental_speechUrl that can be used with HTML audio elements
  • Integration functions for LMNT speech streams through experimental_forwardLmntSpeechStream
  • streamData.experimental_appendSpeech: add speech stream chunks to data stream (used automatically through forward functions)
  • Example: examples/next-lmnt: LMNT completion & chat speech streaming
  • Docs: LMNT provider docs, API docs for experimental_forwardLmntSpeechStream
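
To illustrate the idea behind appending speech chunks to a data stream, here is a minimal sketch. It is not the SDK's implementation: `SketchStreamData`, `forwardSpeechStream`, `collectAudio`, and the message shape are all hypothetical names invented for this example, and the real experimental functions may work quite differently.

```typescript
// Hypothetical sketch only: these names are NOT the AI SDK's API.
type SpeechMessage = { type: "speech"; index: number; audio: number[] };

class SketchStreamData {
  readonly messages: SpeechMessage[] = [];

  // Copy the chunk into a JSON-serializable message and queue it in order.
  appendSpeech(chunk: Uint8Array): void {
    this.messages.push({
      type: "speech",
      index: this.messages.length,
      audio: Array.from(chunk),
    });
  }
}

// Drain a provider's audio stream into the data stream, chunk by chunk.
async function forwardSpeechStream(
  speech: AsyncIterable<Uint8Array>,
  data: SketchStreamData,
): Promise<void> {
  for await (const chunk of speech) {
    data.appendSpeech(chunk);
  }
}

// Receiving side: reassemble the queued chunks into one byte array.
function collectAudio(messages: SpeechMessage[]): Uint8Array {
  const bytes: number[] = [];
  for (const message of messages) {
    bytes.push(...message.audio);
  }
  return Uint8Array.from(bytes);
}
```

In a browser, the reassembled bytes could be wrapped in a Blob and exposed to an audio element via URL.createObjectURL. Whether experimental_speechUrl is produced that way is an assumption, not something this PR description states.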

Notes

  • The LMNT SDK does not work in the edge environment (as of v1.1.2)

lgrammel avatar Jan 17 '24 15:01 lgrammel

This is exciting

untilhamza avatar Jan 18 '24 02:01 untilhamza

This is awesome!

llermaly avatar Jan 24 '24 02:01 llermaly

@lgrammel Hi Lars, I tried to test this locally with no luck. It shows this error:

 ⚠ ./app/api/chat-speech-elevenlabs/route.ts
Attempted import error: 'forwardModelFusionSpeechStream' is not exported from 'ai' (imported as 'forwardModelFusionSpeechStream').

I looked in node_modules/ai and the function is there, so I'm not sure whether I need to do anything else. (I cloned the fork, checked out the branch, and ran the example.)

Is it ready to test?

Thanks!

llermaly avatar Jan 26 '24 01:01 llermaly

Have you rebuilt the ai package? The easiest way is to just rebuild the whole repository (pnpm i, pnpm build) and then try out the example.

lgrammel avatar Jan 26 '24 09:01 lgrammel

That did the trick, thank you! I was running npm run dev; after I ran pnpm build and npm start, it worked.

It works really, really fast. I hope we can get this merged very soon.

llermaly avatar Jan 27 '24 15:01 llermaly

Hi @MaxLeiter! did you have a chance to take a look?

llermaly avatar Jan 30 '24 14:01 llermaly

Hello @lgrammel, I saw that you changed from ElevenLabs to LMNT. Is there a technical reason for this? ElevenLabs supports multiple languages, and LMNT still has no plans to launch this. Wouldn't it be interesting to keep both options?

Thank you, and congratulations on the excellent work.

tgonzales avatar Feb 22 '24 20:02 tgonzales

Thanks. We want to use the official elevenlabs node SDK, but it does not support duplex streaming yet: https://github.com/elevenlabs/elevenlabs-js/issues/4

In the meantime, you could use modelfusion elevenlabs with the adapter that I had in an earlier version of this PR.

lgrammel avatar Feb 23 '24 09:02 lgrammel

@lgrammel Hi! I can't find the example app for speech streaming in the Vercel AI SDK repo. Where has it gone?

Iven2132 avatar Mar 05 '24 09:03 Iven2132

This feature has not been merged yet.

lgrammel avatar Mar 05 '24 09:03 lgrammel

Hi @MaxLeiter Can you merge this?

Iven2132 avatar Mar 05 '24 10:03 Iven2132

Hi @MaxLeiter Can you please approve this?

Iven2132 avatar Mar 09 '24 08:03 Iven2132

bump

llermaly avatar Mar 09 '24 18:03 llermaly

we could really use this as well 🙏 thank you so much for the work on this

pixelcatgg avatar Mar 10 '24 16:03 pixelcatgg

Any reason why this is closed? TTS is a great feature to have

allenchuang avatar May 01 '24 19:05 allenchuang

Would be cool to see TTS added with the addition of gpt-4o

solanacryptodev avatar May 14 '24 23:05 solanacryptodev

@lgrammel / @MaxLeiter Any follow-up plans for adding TTS to the Vercel AI SDK?

alokwhitewolf avatar May 20 '24 05:05 alokwhitewolf