
[Feature]: Support for more ASR services?

Open ashutoshsaboo opened this issue 1 year ago • 4 comments

The Feature

Popular ASR services such as Deepgram, AssemblyAI, Google ASR, and self-hosted Whisper should be supported in litellm.

Motivation, pitch

These are popular, widely used ASR systems; it would be good to have support for them in LiteLLM.

Twitter / LinkedIn details

No response

ashutoshsaboo avatar Jul 25 '24 08:07 ashutoshsaboo

which are you planning on using today?

krrishdholakia avatar Jul 27 '24 17:07 krrishdholakia

We are using all 4 in production. Essentially it's a fallback mechanism for a resilient system (there are several cases where one of the services throttles or fails for whatever reason and needs a backup), in this order: self-hosted Whisper, followed by Deepgram, then Google ASR as the last resort. AssemblyAI is an exception: it has excellent timestamping (probably the best of all the services out there), so we use it for all our async use cases. The other 3 are used more for streaming use cases. @krrishdholakia
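A minimal sketch of the ordered fallback described above, for illustration only. The provider functions here are hypothetical stand-ins (the first one simulates an outage); real ones would call self-hosted Whisper, Deepgram, and Google ASR. None of these names come from litellm.

```python
from typing import Callable, Sequence

# Hypothetical type: a transcriber takes raw audio bytes and returns text.
Transcriber = Callable[[bytes], str]


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has failed."""


def transcribe_with_fallback(
    audio: bytes, providers: Sequence[tuple[str, Transcriber]]
) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, transcript)."""
    errors: list[str] = []
    for name, transcribe in providers:
        try:
            return name, transcribe(audio)
        except Exception as exc:  # throttling, timeouts, outages, ...
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in providers (assumptions, not real SDK calls).
def self_hosted_whisper(audio: bytes) -> str:
    raise TimeoutError("whisper node unavailable")  # simulated failure


def deepgram(audio: bytes) -> str:
    return "hello world"


def google_asr(audio: bytes) -> str:
    return "hello world (google)"


# Same order as in the comment: Whisper, then Deepgram, then Google ASR.
chain = [
    ("whisper", self_hosted_whisper),
    ("deepgram", deepgram),
    ("google", google_asr),
]
used, text = transcribe_with_fallback(b"\x00\x01", chain)
print(used, text)
```

Because the chain is plain data, the order can be changed without touching the calling code.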

ashutoshsaboo avatar Jul 27 '24 17:07 ashutoshsaboo

interesting - why do you want this on litellm, if you already have it working?

krrishdholakia avatar Jul 27 '24 18:07 krrishdholakia

multiple reasons,

  • common api gateway interface for all ASRs, makes code cleaner?
  • no need to handle hardcoded fallbacks on clients, rather have them controlled via litellm serverside?
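To make the two points above concrete, here is a rough sketch of what a common interface with server-side routing could look like. Everything in it (`ASRProvider`, `EchoProvider`, `REGISTRY`, `transcribe`) is hypothetical and is not litellm's API; the adapter just echoes the input size instead of calling a real service.

```python
from dataclasses import dataclass
from typing import Protocol


class ASRProvider(Protocol):
    """Hypothetical common interface that every ASR adapter would satisfy."""

    def transcribe(self, audio: bytes) -> str: ...


@dataclass
class EchoProvider:
    """Stand-in adapter; a real one would wrap a vendor SDK or HTTP API."""

    label: str

    def transcribe(self, audio: bytes) -> str:
        return f"[{self.label}] {len(audio)} bytes"


# Server-side routing table: clients only pass a provider name,
# so swapping or reordering backends needs no client change.
REGISTRY: dict[str, ASRProvider] = {
    "whisper": EchoProvider("whisper"),
    "deepgram": EchoProvider("deepgram"),
}


def transcribe(provider: str, audio: bytes) -> str:
    """Dispatch to the named backend through the shared interface."""
    try:
        backend = REGISTRY[provider]
    except KeyError:
        raise ValueError(f"unknown ASR provider: {provider}") from None
    return backend.transcribe(audio)


result = transcribe("deepgram", b"abc")
print(result)
```

Clients call one function with one signature, and the mapping from name to backend lives on the server.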

ashutoshsaboo avatar Jul 27 '24 18:07 ashutoshsaboo

+1 Waiting on Deepgram to be added before we can really review the idea of litellm in production.

LiamSystems avatar Sep 15 '24 17:09 LiamSystems

Hi @LiamSystems, can we hop on a call to learn what you need to use LiteLLM in production? I'd love to unblock you.

my cal for your convenience: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
my linkedin if you prefer DMs: https://www.linkedin.com/in/reffajnaahsi/

ishaan-jaff avatar Sep 17 '24 03:09 ishaan-jaff

+1 for Deepgram for STT and TTS; might also want to consider ElevenLabs for TTS. Deepgram has features like diarization, entity detection, PII redaction, etc., supports direct S3 access via presigned URLs with callback, and is extremely quick.

stephaneminisini avatar Sep 24 '24 21:09 stephaneminisini

+1 for ElevenLabs and Deepgram

mirodrr2 avatar Dec 11 '24 15:12 mirodrr2

+1 deepgram

sfarthin avatar Dec 11 '24 19:12 sfarthin

+1 deepgram and gladia

bunscc avatar Dec 28 '24 17:12 bunscc

Working on deepgram as a v0.

Deepgram seems to have a couple of different endpoints here. Would appreciate help making sure our implementation is good.

Working on an initial version for today's release.

krrishdholakia avatar Dec 29 '24 00:12 krrishdholakia