
Transcription: use video language if provided instead of auto-detection

Open schmaker opened this issue 2 months ago • 0 comments

Describe the current behavior

The transcription language of this video is misdetected: https://vhsky.cz/w/ajtLAaxCKVH5YgyRJouzYH. Instead of transcribing the audio in English, it tries to translate it into Czech.

Research we did so far:

  1. Checked the video language in PeerTube: it is set correctly (screenshot attached).
  2. Re-generated the transcription in case the wrong language had been set on upload: the captions are still translated into Czech.
  3. Checked the instance for anything that could force the Czech language: found nothing, and even the main instance language is set to English (screenshot attached).

It seems to me that WhisperAI is guessing the language and getting it wrong.

Steps to reproduce

  1. Generate English captions for this video.
  2. The captions come out in Czech, as if translated, instead of being a plain transcription.

Describe the expected behavior

WhisperAI should use the language set in the PeerTube video details, when available, before trying to guess the language.
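
To illustrate the requested behavior, here is a minimal sketch of how a language hint could be passed to the transcription command. This is not PeerTube's actual code: the function name `build_transcription_args` and the `SUPPORTED_LANGUAGES` set are hypothetical, and I am assuming the `whisper-ctranslate2` CLI accepts `--model`, `--task` and `--language` the way the OpenAI Whisper CLI does, which should be checked against the installed version.

```python
from typing import List, Optional

# Hypothetical subset of language codes accepted by the model;
# a real implementation would use the engine's full list.
SUPPORTED_LANGUAGES = {"en", "cs", "de", "fr", "es"}


def build_transcription_args(
    input_path: str, model: str, video_language: Optional[str]
) -> List[str]:
    """Build the CLI arguments, forcing the language when the video has one."""
    args = ["whisper-ctranslate2", input_path, "--model", model, "--task", "transcribe"]

    # Normalize the language stored in the video details (may be missing).
    lang = (video_language or "").strip().lower()

    # Only pass a hint the engine understands; otherwise fall back to
    # the engine's own auto-detection by omitting --language.
    if lang in SUPPORTED_LANGUAGES:
        args += ["--language", lang]

    return args
```

With this approach, a video marked as English would be transcribed with `--language en`, and auto-detection would only be used for videos with no language set or with a language the model does not know.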

Additional information

  • PeerTube instance:
    • URL: https://vhsky.cz
    • Version: 7.3.0
    • Transcription engine: CTranslate2
    • Model: Large-v3

schmaker, Nov 10 '25 19:11