faster-whisper Gibberish Outputs

Open RohitMidha23 opened this issue 1 year ago • 3 comments

After converting a fine-tuned Hugging Face Whisper model to CTranslate2 format and running it with faster-whisper, I get complete gibberish as output.

I've tried several different versions, but the output still contains a lot of periods and dashes and doesn't make much sense.

The same audio files, when passed to the original model, are transcribed exceptionally well, hence the question.

I am currently converting the model with ctranslate2 v4.1.0 and faster-whisper v1.0.1.

@trungkienbkhn can you please help?

May 08 '24 21:05 RohitMidha23

@RohitMidha23, hello. Which HF model did you convert to CTranslate2 format? And could you show your conversion command?

May 09 '24 03:05 trungkienbkhn

@trungkienbkhn it is a model fine-tuned from whisper-large-v2. The command I used is:

ct2-transformers-converter --model "model_path" \
--output_dir "output_model_path" \
--copy_files tokenizer_config.json preprocessor_config.json special_tokens_map.json generation_config.json \
--quantization float16

May 09 '24 05:05 RohitMidha23

@RohitMidha23 In fact, a few models do come out of conversion with lower quality than the original. You can try removing the --quantization float16 option from the conversion command. Alternatively, pass condition_on_previous_text=False when transcribing. We had the same issue with the distil-large-v2 model conversion; you can refer to this comment.

May 09 '24 07:05 trungkienbkhn
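
For readers who want to try both suggestions from Python, here is a minimal sketch. It is not code from the thread: the helper names (`convert_without_quantization`, `transcription_options`, `transcribe`) and the path arguments are hypothetical, and it assumes `ctranslate2` and `faster-whisper` are installed. The conversion step uses the Python converter class in place of the `ct2-transformers-converter` command shown earlier, with the same list of copied files.

```python
# Sketch only: helper names and paths are hypothetical placeholders.

# Same files the original conversion command copied alongside the model.
FILES_TO_COPY = [
    "tokenizer_config.json",
    "preprocessor_config.json",
    "special_tokens_map.json",
    "generation_config.json",
]


def convert_without_quantization(model_path: str, output_dir: str) -> str:
    """Convert a Hugging Face Whisper checkpoint without a quantization step."""
    # Imported lazily so this module loads even if ctranslate2 is missing.
    from ctranslate2.converters import TransformersConverter

    converter = TransformersConverter(model_path, copy_files=FILES_TO_COPY)
    # quantization=None: the counterpart of dropping --quantization float16.
    converter.convert(output_dir, quantization=None)
    return output_dir


def transcription_options() -> dict:
    """Decoding options for the second suggestion in the thread."""
    return {"condition_on_previous_text": False}


def transcribe(model_dir: str, audio_path: str) -> str:
    """Transcribe one file with the converted model and join the segments."""
    from faster_whisper import WhisperModel

    model = WhisperModel(model_dir)
    segments, _info = model.transcribe(audio_path, **transcription_options())
    # `segments` is a generator; iterating it is what runs the decoding.
    return " ".join(segment.text.strip() for segment in segments)
```

Trying the two changes one at a time (first only `condition_on_previous_text=False`, then a re-conversion without quantization) makes it easier to see which of them, if either, removes the periods and dashes.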
