insanely-fast-whisper
Nearly 2x performance difference with another Docker image on the same system
Hi, thanks for your work on replicate.com.

I'm using your Docker image r8.im/vaibhavs10/incredibly-fast-whisper and another image, yoeven/insanely-fast-whisper-api, built from JigsawStack/insanely-fast-whisper-api (which includes a Dockerfile).

On the same RTX 4090 system, with the same audio and the same parameters:
```python
import torch
from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3",
    torch_dtype=torch.float16,
    device="cuda:0",
    model_kwargs={"attn_implementation": "flash_attention_2"},
)
generate_kwargs = {
    "task": "transcribe",
    "language": "chinese",
    "repetition_penalty": 1.25,
}
# url points to the audio file being transcribed
outputs = pipe(
    url,
    chunk_length_s=30,
    batch_size=20,
    generate_kwargs=generate_kwargs,
    return_timestamps=True,
)
```
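The transcribe times below are wall-clock around the `pipe()` call, measured roughly like this (a minimal sketch; the exact harness is my own, not anything special):

```python
import time

start = time.perf_counter()
outputs = pipe(
    url,
    chunk_length_s=30,
    batch_size=20,
    generate_kwargs=generate_kwargs,
    return_timestamps=True,
)
print(f"transcribe time: {time.perf_counter() - start:.1f}s")
```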
After several runs, I get quite different performance results (the transcription outputs are identical):
| | yours | the other one |
|---|---|---|
| transcribe time | 72 s | 42 s |
| average GPU usage (gpustat) | 22% | 40% |
| average GPU memory | 10679 MB | 10690 MB |
| torch version | 2.0.2+cu118 | 2.2.0a0+81ea7a4 |
| CUDA version | cuda_11.8.r11.8 | cuda_12.3.r12.3 |
| decompressed image size | 16.9 GB | 34.7 GB |
I originally tried your image because of the image size difference. Could you share the Dockerfile, so we can test whether the performance gap comes from the CUDA version? I'm assuming you build from NVIDIA base images, which would make the CUDA version easy to swap; see the sketch below.
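For anyone who wants to experiment before the actual Dockerfile is available, here is a minimal sketch under that assumption. The base-image tag, package list, and entrypoint script name are all guesses on my part, not the real replicate.com build:

```dockerfile
# Hypothetical build sketch -- NOT the actual r8.im image.
# Swap the base tag (e.g. 11.8.0 vs 12.1.1) to compare CUDA versions.
FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04

RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 python3-pip ffmpeg \
    && rm -rf /var/lib/apt/lists/*

# PyTorch wheels bundle their own CUDA runtime; pick the wheel index
# that matches the base image (cu118 for 11.8, cu121 for 12.1).
RUN pip3 install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cu121 \
    && pip3 install --no-cache-dir transformers accelerate \
    # flash-attn compiles against nvcc, hence the -devel base image
    && pip3 install --no-cache-dir flash-attn --no-build-isolation

WORKDIR /app
# transcribe.py is a placeholder for the pipeline script above
COPY transcribe.py .
CMD ["python3", "transcribe.py"]
```

Building two variants of this image that differ only in the base tag and torch wheel index should isolate the CUDA/torch contribution from everything else in the stack.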
Are you running this on your local machine? I'd like to reproduce the test. Could you post your docker run command or docker compose YAML, so I can see how you're passing in your GPU, etc.? Something like the sketch below.
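For reference, cog-built replicate images are usually started along these lines; the port mapping and the "audio" input field name here are my assumptions, not taken from the post above:

```bash
# Expose the GPU to the container and map the cog HTTP server's port.
docker run -d --gpus all -p 5000:5000 r8.im/vaibhavs10/incredibly-fast-whisper

# Send a prediction request; "audio" is the assumed input field name.
curl -s -X POST http://localhost:5000/predictions \
  -H "Content-Type: application/json" \
  -d '{"input": {"audio": "https://example.com/sample.mp3"}}'
```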