Handling Mixed Chinese-English Speech Causes Hallucinations and Information Loss in Translation Mode

Open chenxu2656 opened this issue 4 months ago • 0 comments

Problem Description

When processing audio containing mixed Chinese and English content (code-switching scenarios), Whisper exhibits two major issues:

In translation mode (task="translate"): The model produces severe repetitive hallucinations, generating endless repetitions of phrases like "The President The President The President..." instead of proper translations.

In transcription mode (task="transcribe"): Transcription partially works, but a significant amount of content is lost at the transition points between Chinese and English, so the resulting text is semantically incomplete.
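For anyone trying to reproduce the translation-mode problem, here is a rough sketch of how the looping output could be flagged automatically. It is not part of Whisper: the helper names `max_phrase_repeats` and `looks_like_repetition_loop`, and the threshold of 4, are assumptions chosen for illustration. It only looks at the English text that comes back and counts how many times a short phrase repeats back-to-back.

```python
import re


def max_phrase_repeats(text: str, max_ngram: int = 4) -> int:
    """Return the longest run of one phrase (1..max_ngram words)
    repeated back-to-back in `text`. Hypothetical helper, not a Whisper API."""
    words = re.findall(r"\w+", text.lower())
    best = 1 if words else 0
    for n in range(1, max_ngram + 1):
        for start in range(len(words) - n + 1):
            phrase = words[start:start + n]
            count = 1
            pos = start + n
            # Keep stepping forward while the next n words equal the phrase.
            while words[pos:pos + n] == phrase:
                count += 1
                pos += n
            best = max(best, count)
    return best


def looks_like_repetition_loop(text: str, threshold: int = 4) -> bool:
    """Flag text where some short phrase repeats at least `threshold` times
    in a row. The threshold is an arbitrary assumption; tune it as needed."""
    return max_phrase_repeats(text) >= threshold


if __name__ == "__main__":
    sample = "The President The President The President The President"
    print(looks_like_repetition_loop(sample))
    print(looks_like_repetition_loop("This is an ordinary translated sentence."))
```

This could be applied to the text of each returned segment to count how many segments are affected. It splits on word boundaries, so it is only meaningful for the English output of translation mode, not for unsegmented Chinese text.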

Has anyone else run into this issue, or does anyone have a solution?

Nov 03 '25 10:11 chenxu2656