[Feature Request] Customized Audio Transcription Provider

Open cpwan opened this issue 7 months ago • 0 comments

Current implementation: https://github.com/microsoft/markitdown/blob/62b72284feb986ffaf8c22fa73614545b5713c30/packages/markitdown/src/markitdown/converters/_transcribe_audio.py#L48

The problem is that when my audio is not in English, it still tries to transcribe it as English. Would it be possible to add configurable options for switching to another provider? The SpeechRecognition library offers several alternatives. Technically, this should not be hard to implement. The harder question is how to make the configuration intuitive and easy to understand.
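To make the idea concrete, here is a rough sketch of one possible shape for the configuration. It is not a proposal for the final API. The names `TranscriptionConfig`, `resolve_recognize`, and `transcribe_audio` are hypothetical and do not exist in markitdown. The sketch assumes that the provider name maps onto SpeechRecognition's `Recognizer.recognize_<provider>` methods (for example `recognize_google` or `recognize_sphinx`) and that the chosen method accepts a `language` keyword. That holds for `recognize_google`, but the keyword and its accepted values differ between providers, so anything provider-specific goes through `extra`.

```python
from dataclasses import dataclass, field
from functools import partial
from typing import Any, BinaryIO, Callable, Dict, Optional


@dataclass
class TranscriptionConfig:
    """Hypothetical options; not part of markitdown's current API."""

    provider: str = "google"        # resolved to Recognizer.recognize_<provider>
    language: Optional[str] = None  # e.g. "zh-TW"; None keeps the provider default
    extra: Dict[str, Any] = field(default_factory=dict)  # provider-specific kwargs


def resolve_recognize(recognizer: Any, config: TranscriptionConfig) -> Callable[[Any], str]:
    """Return a one-argument callable bound to the configured provider."""
    method = getattr(recognizer, f"recognize_{config.provider}", None)
    if method is None:
        raise ValueError(f"Unknown transcription provider: {config.provider!r}")
    kwargs = dict(config.extra)
    if config.language is not None:
        # Assumption: the provider takes a `language` keyword.
        kwargs["language"] = config.language
    return partial(method, **kwargs)


def transcribe_audio(file_stream: BinaryIO, config: Optional[TranscriptionConfig] = None) -> str:
    """Transcribe a WAV/AIFF/FLAC stream with the configured provider."""
    import speech_recognition as sr  # third-party; imported lazily

    config = config or TranscriptionConfig()
    recognizer = sr.Recognizer()
    with sr.AudioFile(file_stream) as source:
        audio = recognizer.record(source)
    return resolve_recognize(recognizer, config)(audio).strip()
```

With something like this, a caller could write `transcribe_audio(stream, TranscriptionConfig(language="zh-TW"))` to keep the default provider but change the language, or set `provider="sphinx"` to switch providers. How such a config would be passed down from `MarkItDown` to the converter is left open here.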

May 29 '25 06:05 cpwan