Segments are too long compared to faster-whisper with large-v2
Hi,

I ran into two problems when using whisperX:

- whisperX (`model.transcribe`) generates segments that are much longer than faster-whisper's. Is there a parameter that controls segment length? (See the sketch after my code below.)
- The first 5 seconds of text are missing in the whisperX output.

Thanks a lot.
Below is my code:

```python
import time

import whisperx

device = "cuda"
audio_file = "audio.mp3"
batch_size = 16  # reduce if low on GPU mem
compute_type = "float16"  # change to "int8" if low on GPU mem (may reduce accuracy)

model = whisperx.load_model("large-v2", device, compute_type=compute_type)
audio = whisperx.load_audio(audio_file)

result = model.transcribe(audio, batch_size=batch_size)
t3 = time.time_ns() / 1000000  # timestamp in ms
print(result["segments"])  # before alignment
```
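For anyone hitting the same issue: two knobs that *may* be relevant, assuming the installed whisperX version exposes them, are `chunk_size` on `transcribe` (smaller audio chunks should yield shorter segments) and the VAD thresholds passed via `vad_options` to `load_model` (an aggressive voice-activity onset could explain trimmed opening speech). A minimal sketch, not verified against every whisperX release, so check the signatures in your installed version:

```python
import whisperx

device = "cuda"
compute_type = "float16"

# vad_options keys ("vad_onset"/"vad_offset") and the chunk_size parameter are
# assumptions about the installed whisperX API; verify against your version.
model = whisperx.load_model(
    "large-v2",
    device,
    compute_type=compute_type,
    vad_options={"vad_onset": 0.4, "vad_offset": 0.3},  # a lower onset may keep early speech
)

audio = whisperx.load_audio("audio.mp3")

# Smaller chunk_size (seconds) should bound how long each returned segment can be.
result = model.transcribe(audio, batch_size=16, chunk_size=15)
print(result["segments"])
```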
I have a similar problem. Have you solved it?