
Added whisper compatibility

Open vairodp opened this issue 2 years ago • 2 comments

TODO:

  • fix compatibility with base

Co-authored-by: Klein Tahiraj

vairodp avatar Mar 03 '23 20:03 vairodp

We took the liberty of turning the banana.dev Whisper example by CG80499 into a notebook, as it was still a .py file, and put it in the right folder together with the example for our implementation.

klein-t avatar Mar 05 '23 21:03 klein-t

A little list of what has been done:

  • new OpenAI Whisper audio model integration, both for transcription and translation;
  • Whisper audio model integration in audio chains;
  • rewritten AudioLoader to work with any audio_model, adding metadata to the output. Changed the filename from mp3_files.py to audio_files.py;
  • notebook on how to use Whisper audio model in AudioChain and deploy a SimpleSequentialChain with such a chain;
  • notebook on how to use the AudioLoader with Whisper as audio model;
  • notebook on how to use the AudioBanana audio model in an AudioChain (you previously asked the guy taking care of the banana.dev model to do this, but since I hadn't seen any answer, I did it while I was also writing the docs for our Whisper model, to keep a consistent style across the documentation);
  • updated all the needed __init__ files.
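
To illustrate the "works with any audio_model" idea, here is a rough sketch of the pattern, not the code from this PR. Every name in it (`AudioModel`, `transcript`, `SketchAudioLoader`, `Document`, `EchoModel`) is a hypothetical stand-in chosen for the example, and the metadata keys are assumptions about what such a loader might record.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Protocol


class AudioModel(Protocol):
    """Hypothetical interface: anything that turns an audio file into text."""

    def transcript(self, audio_path: str) -> str:
        ...


@dataclass
class Document:
    """Stand-in for a document object: transcribed text plus metadata."""

    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


class SketchAudioLoader:
    """Hypothetical loader that delegates transcription to any audio model."""

    def __init__(self, audio_model: AudioModel, file_paths: List[str]):
        self.audio_model = audio_model
        self.file_paths = file_paths

    def load(self) -> List[Document]:
        docs = []
        for path in self.file_paths:
            # The loader does not care which model it was given,
            # only that it exposes a transcript() method.
            text = self.audio_model.transcript(path)
            docs.append(
                Document(
                    page_content=text,
                    # Assumed metadata keys, for illustration only.
                    metadata={
                        "source": path,
                        "file_name": Path(path).name,
                        "audio_model": type(self.audio_model).__name__,
                    },
                )
            )
        return docs


class EchoModel:
    """Fake model for the demo; a real one would call Whisper or banana.dev."""

    def transcript(self, audio_path: str) -> str:
        return f"transcript of {Path(audio_path).name}"


docs = SketchAudioLoader(EchoModel(), ["audio/a.mp3"]).load()
print(docs[0].page_content, docs[0].metadata)
```

Swapping `EchoModel` for a Whisper-backed or Banana-backed class with the same method would leave the loader untouched, which is the point of the rewrite described above.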

A little note on our chaotic workflow:

As you will have noticed, we made a little more than one or two commits to close this PR haha

The reason behind this is that we both had problems with the poetry installation. We didn't know how long the rabbit hole of fixing those problems would take us, so we committed as soon as we were confident enough about our work haha

To commit or not to commit? Faced with this, we chose to commit: we wanted to deliver asap, since we both see a lot of potential in audio chains based on Whisper and we both needed them for our projects.

Now we are good with poetry, so I can promise only neat commits from now on haha

klein-t avatar Mar 08 '23 11:03 klein-t

Why has this feature not been published in master?

joqk12345 avatar Apr 10 '23 08:04 joqk12345

@joqk12345 because the maintainers are still figuring out the correct abstraction needed for audio chains.

klein-t avatar Apr 25 '23 09:04 klein-t