WhisperLiveKit

French > English translation ['] > [fr ']

Open CorentinvdBdO opened this issue 3 months ago • 2 comments

A recurring issue shows up when French transcription is translated into English:

{"speaker":3,"text":" J'ai tenté un truc.","start":"0:00:18","end":"0:00:19","translation":"Jfr 'ai tent is a trick. ","detected_language":"fr"}

The same happens with C'est, which becomes Cfr 'est.

This looks like a problem with the apostrophe (') in French text: it comes out as fr ' in the output, which then breaks the translation. It might be related to non-breaking spaces, since Cool ! becomes Coolfr!.

Parameters: whisperlivekit-server --host 0.0.0.0 --model large-v3 --language fr --port 1243 --pcm-input --diarization --target-language en
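
To make the symptom easier to spot in logs, here is a rough sketch of a check for the stray language tag in the `translation` field. It is not part of WhisperLiveKit: the function names `has_leaked_lang_tag` and `strip_leaked_lang_tag` are hypothetical, and the pattern only assumes what the examples above show, namely that `fr` gets glued to the end of a word right before an apostrophe or punctuation mark. It is a post-hoc workaround for inspection, not a fix for the underlying cause.

```python
import re


def _leak_pattern(lang: str = "fr") -> re.Pattern:
    # Hypothetical heuristic: the language code directly follows a word
    # character and is followed (optionally after one whitespace character)
    # by an apostrophe or punctuation mark, as in "Jfr 'ai" or "Coolfr!".
    return re.compile(r"(?<=\w)" + re.escape(lang) + r"\s?(?=['’!?;:])")


def has_leaked_lang_tag(translation: str, lang: str = "fr") -> bool:
    """Return True if the translation looks like it contains a leaked tag."""
    return _leak_pattern(lang).search(translation) is not None


def strip_leaked_lang_tag(translation: str, lang: str = "fr") -> str:
    """Remove the leaked tag (and the whitespace after it, if any)."""
    return _leak_pattern(lang).sub("", translation)


if __name__ == "__main__":
    for sample in ["Jfr 'ai tent is a trick. ", "Cfr 'est", "Coolfr!"]:
        print(repr(sample), has_leaked_lang_tag(sample), repr(strip_leaked_lang_tag(sample)))
```

The heuristic can misfire on a legitimate word that ends in "fr" right before punctuation, so it is better suited to flagging affected segments than to silently rewriting them.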

CorentinvdBdO, Sep 26 '25 15:09

Hi, yeah, the model struggles when it does not have enough tokens to work with (at the beginning of sentences). I plan to fix that.

QuentinFuxa, Oct 06 '25 17:10

Work in progress; the fix will be in the next release.

QuentinFuxa, Oct 27 '25 23:10