langchaingo llms: Add support for using the whisper model to transcribe audio

Open devalexandre opened this issue 1 year ago • 3 comments

PR Checklist

[x] Read the Contributing documentation.
[x] Read the Code of conduct documentation.
[x] Name your Pull Request title clearly, concisely, and prefixed with the name of the primarily affected package you changed according to Good commit messages (such as memory: add interfaces for X, Y or util: add whizzbang helpers).
[x] Check that there isn't already a PR that solves the problem the same way to avoid creating a duplicate.
[x] Provide a description in this PR that addresses what the PR is solving, or reference the issue that it solves (e.g. Fixes #123).
[x] Describes the source of new concepts.
[ ] References existing implementations as appropriate.
[x] Contains test coverage for new functions.
[x] Passes all golangci-lint checks.

Mar 20 '24 04:03 devalexandre

I agree with @eliben's intuition here: I'm not sure audio transcription as a concept fits into our llm namespace. I'm open to exposing this and generalizing over providers, but I think it belongs in a different namespace.

Mar 26 '24 20:03 tmc

@tmc, @eliben

What would be the implementation idea for this functionality? Maybe use openai.TranscribeAudio, keeping it only within the openai package and not in the LLM namespace?

Alternatively, I think we could do it the way LangChain.js does and implement it as a document loader: https://js.langchain.com/docs/integrations/document_loaders/file_loaders/openai_whisper_audio

Mar 27 '24 13:03 devalexandre

@tmc any update?

Apr 23 '24 12:04 devalexandre
