[FEATURE] Add the SenseVoice multilingual voice understanding model
Description
I would like the SenseVoice speech recognition model to be added as an extension. The existing TTS extensions in the TEN framework sound too mechanical and lack emotion. SenseVoice performs better in this regard, so I recommend adding it.
Severity
Critical
Additional Information
https://github.com/FunAudioLLM/SenseVoice
SenseVoice is a speech foundation model with multiple speech understanding capabilities, including automatic speech recognition (ASR), spoken language identification (LID), speech emotion recognition (SER), and audio event detection (AED).
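As a rough sketch of how an extension might consume those four outputs together, assume the model returns one string whose leading `<|...|>` tags carry the language, emotion, and audio event, followed by the transcript. That tag layout, and the names `SpeechResult` and `parse_tagged_output`, are assumptions made for illustration; they are not part of the TEN framework or a confirmed SenseVoice API.

```python
import re
from dataclasses import dataclass

# Assumed layout: <|language|><|EMOTION|><|event|><|other flags|>transcript
TAG_PATTERN = re.compile(r"<\|([^|]*)\|>")


@dataclass
class SpeechResult:
    language: str  # spoken language identification (LID)
    emotion: str   # speech emotion recognition (SER)
    event: str     # audio event detection (AED)
    text: str      # automatic speech recognition (ASR) transcript


def parse_tagged_output(raw: str) -> SpeechResult:
    """Split a tagged model output into its four parts (hypothetical helper)."""
    tags = TAG_PATTERN.findall(raw)
    # Everything that is not a tag is treated as the transcript.
    text = TAG_PATTERN.sub("", raw).strip()
    # Pad with empty strings so a short or missing tag prefix does not raise.
    tags += [""] * (3 - len(tags))
    return SpeechResult(language=tags[0], emotion=tags[1], event=tags[2], text=text)


result = parse_tagged_output("<|en|><|HAPPY|><|Speech|><|withitn|>Nice to meet you.")
print(result)
```

Keeping the emotion and event fields next to the transcript would let downstream extensions react to how something was said, not only to what was said.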
same here
SenseVoice is an STT extension, not a TTS extension.
Are you referring to integration with the commercial SenseVoice API or with the local SenseVoice model?