ragflow [Feature Request]: Can RAGFlow add a speech-to-text model such as faster-whisper? Is Fish Audio the only text-to-speech model that can be added?

Is there an existing issue for the same feature request?

[X] I have checked the existing issues.

Is your feature request related to a problem?

No response

Describe the feature you'd like

I would like to be able to add speech-to-text models such as Whisper, as well as text-to-speech models such as CosyVoice.

Describe implementation you've considered

No response

Documentation, adoption, use case

No response

Additional information

No response

Oct 17 '24 07:10 ZxnSnowy

Do you mean this?

Oct 18 '24 01:10 KevinHuSh

Do you mean this?

Yes, how did you add it? Can I add a local model?

Oct 21 '24 01:10 ZxnSnowy

Add a valid OpenAI API key.

Oct 21 '24 04:10 KevinHuSh

Add a valid OpenAI API key.

Thank you for your reply. I was wondering whether I could add local models that I have downloaded, such as faster-whisper and GPT-SoVITS.

Oct 21 '24 05:10 ZxnSnowy

What about using Xinference to deploy your LLM model?

Oct 22 '24 03:10 KevinHuSh

What about using Xinference to deploy your LLM model?

@KevinHuSh Good idea. I have already used Xinference to deploy CosyVoice2-0.5B. My question is, how can we use this model? Do we need to develop the audio workflow ourselves and get the data back from the Xinference service?

May 06 '25 11:05 zzc-ccccc
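
For reference, a minimal sketch of calling a text-to-speech model served by Xinference directly, outside of RAGFlow. It assumes Xinference is reachable at `http://localhost:9997`, that it exposes an OpenAI-compatible `/v1/audio/speech` route, and that the model was launched with the UID `CosyVoice2-0.5B`; all three are assumptions to check against your own deployment. The helper names `build_speech_request` and `synthesize` are hypothetical, and whether a `voice` value or a reference prompt is required depends on the model. This does not show how RAGFlow itself wires the model in.

```python
import json
import urllib.request

# Assumption: local Xinference endpoint; adjust to your deployment.
XINFERENCE_BASE_URL = "http://localhost:9997"


def build_speech_request(text, model="CosyVoice2-0.5B", voice=None,
                         base_url=XINFERENCE_BASE_URL):
    """Build a POST request for an OpenAI-style text-to-speech endpoint.

    `model` is the model UID chosen at launch time (assumed here).
    `voice` is only included when given, since not every model uses it.
    """
    payload = {"model": model, "input": text}
    if voice is not None:
        payload["voice"] = voice
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="reply.mp3", **kwargs):
    """Send the request and write the returned audio bytes to `out_path`."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request) as response:
        audio_bytes = response.read()
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio_bytes)
    return out_path
```

Calling `synthesize("Hello from RAGFlow")` would write the audio returned by the service to `reply.mp3`, provided the service is running and the assumptions above hold. Note that a text-to-speech call returns audio; getting text back requires a speech-to-text model such as Whisper.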