faster-whisper
WhisperModel's num_workers parameter: how to make execution truly parallel?
The WhisperModel constructor has a num_workers parameter. I only have one GPU, and when multiple threads call the transcribe function, execution is effectively serial in my tests: two threads processing two files take twice as long as one thread processing one file. How can I make the execution truly parallel? A minimal sketch of my setup is shown below.
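A minimal sketch of the setup described above, assuming a single GPU and two placeholder files (audio1.wav, audio2.wav); with this configuration the two transcribe() calls still end up running one after the other:

```python
from concurrent.futures import ThreadPoolExecutor

from faster_whisper import WhisperModel

# Single GPU, two workers: concurrent transcribe() calls are accepted,
# but both end up competing for the same device.
model = WhisperModel("large-v2", device="cuda", num_workers=2)

def transcribe_file(path):
    segments, _info = model.transcribe(path)
    # Consume the generator so the transcription actually runs.
    return "".join(segment.text for segment in segments)

files = ["audio1.wav", "audio2.wav"]  # placeholder file names
with ThreadPoolExecutor(max_workers=2) as executor:
    results = list(executor.map(transcribe_file, files))
```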
@fanqiangwei, hello. I think using multiple workers on a single GPU will not increase throughput; you should use multiple GPUs with the device_index option. For more information, see this issue: https://github.com/SYSTRAN/faster-whisper/issues/100
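A minimal sketch of what the multi-GPU approach could look like, assuming two GPUs (device indices 0 and 1) and the same placeholder file names; adjust the indices and model size to your hardware:

```python
from concurrent.futures import ThreadPoolExecutor

from faster_whisper import WhisperModel

# Load the model replicated across two GPUs (indices 0 and 1 are assumed
# here). num_workers lets concurrent transcribe() calls use separate
# workers instead of queueing behind each other.
model = WhisperModel(
    "large-v2",
    device="cuda",
    device_index=[0, 1],
    compute_type="float16",
    num_workers=2,
)

def transcribe_file(path):
    segments, _info = model.transcribe(path)
    # Consume the generator so the transcription actually runs.
    return path, "".join(segment.text for segment in segments)

files = ["audio1.wav", "audio2.wav"]  # placeholder file names
with ThreadPoolExecutor(max_workers=2) as executor:
    for path, text in executor.map(transcribe_file, files):
        print(path, text)
```

With the model spread over two devices, each thread's transcription can be served by a different GPU, which is where the actual parallel speedup comes from.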