stdin support
cat sound.wav | whisper-ctranslate2
Changed src/whisper-ctranslate2.py: edits the check for when no audio CLI argument is given and live transcription is disabled. If there is stdin data, a new temporary file is created, the stdin data is written to it, and the audio CLI argument is updated to point to the newly created temporary file.
The purpose of this is to allow external programs, written in any language, to pipe audio data into whisper-ctranslate2.
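The described change can be sketched roughly as follows; this is a minimal illustration, not the actual patch, and the function name `resolve_audio_arg` is hypothetical:

```python
import sys
import tempfile

def resolve_audio_arg(audio_path):
    """If no audio path was given but data is piped on stdin, write the
    stdin bytes to a temporary file and return that file's path instead."""
    if audio_path is None and not sys.stdin.isatty():
        data = sys.stdin.buffer.read()
        if data:
            # delete=False so the file survives until transcription runs
            tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
            tmp.write(data)
            tmp.close()
            return tmp.name
    return audio_path
```

With this in place, `cat sound.wav | whisper-ctranslate2` behaves the same as passing the file path directly.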
Tomorrow we will have a new release for this GPU issue.
Great! Looking forward to it!
Could you please test debug image to verify GPU issue is gone?
docker run -d --gpus all -p 9000:9000 -e ASR_MODEL=base onerahmet/openai-whisper-asr-webservice:debug-gpu
@ahmetoner - I am having the same GPU issue. I've tried the debug-gpu, v1.1.0-gpu, and latest-gpu image tags but all files are still transcribed using CPUs.
In the screenshot you can see I run nvidia-smi and my four GPUs are available to the container image, however, from the activated venv torch does not recognize the GPUs.
Sorry, wasn't able to test until now (had to take a board exam). It's still using the CPU. Here are the logs from the docker container:
[2023-06-03 15:33:39 +0000] [7] [INFO] Starting gunicorn 20.1.0
[2023-06-03 15:33:39 +0000] [7] [INFO] Listening at: http://0.0.0.0:9000 (7)
[2023-06-03 15:33:39 +0000] [7] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2023-06-03 15:33:39 +0000] [8] [INFO] Booting worker with pid: 8
/app/.venv/lib/python3.10/site-packages/torch/cuda/__init__.py:88: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:109.)
return torch._C._cuda_getDeviceCount() > 0
0%| | 0.00/461M [00:00<?, ?iB/s]
[... model download progress elided ...]
100%|███████████████████████████████████████| 461M/461M [00:12<00:00, 37.5MiB/s]
[2023-06-03 15:33:56 +0000] [8] [INFO] Started server process [8]
[2023-06-03 15:33:56 +0000] [8] [INFO] Waiting for application startup.
[2023-06-03 15:33:56 +0000] [8] [INFO] Application startup complete.
/app/.venv/lib/python3.10/site-packages/whisper/transcribe.py:78: UserWarning: FP16 is not supported on CPU; using FP32 instead
warnings.warn("FP16 is not supported on CPU; using FP32 instead")
I managed to get it working by adding "--runtime nvidia" and "-e NVIDIA_VISIBLE_DEVICES=all"
Which image did you use, @qjao ?
Thanks, but no luck.
@ahmetoner could you share which CUDA chips you've tested this build with? I'm wondering if I would need to build with a different cuda version. I'm on a machine with an NVIDIA Titan RTX card and a GeForce GTX 1070.
Having the same issue with Cuda 12.2. Can't seem to get this to use the GPU. Very cool project. CPU seems to work, though I have noticed a lot of words don't get translated correctly, and sentences between people all seem to run together. I'll test the debug image and see if that helps
Have you had success getting GPU to work with the regular openai/whisper repo?
What driver are you on? It's probably due to incompatibility between CUDA and nvidia driver versions. Please refer to this compatibility matrix and make sure your driver is supported. The latest version of this project uses CUDA 11.8. Your driver could be too old or too new to use it.
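As a rough illustration of the driver check, a driver version string can be compared against the minimum the CUDA toolkit requires. The 520.61.05 minimum for CUDA 11.8 on Linux below is an assumption drawn from NVIDIA's published matrix; verify it against the matrix for your exact platform:

```python
def driver_supports_cuda_11_8(driver_version: str) -> bool:
    """Return True if the NVIDIA driver version meets the assumed
    CUDA 11.8 minimum on Linux (520.61.05 per NVIDIA's matrix)."""
    minimum = (520, 61, 5)
    parts = tuple(int(p) for p in driver_version.split("."))
    # Tuple comparison handles each version component in order
    return parts >= minimum
```

For example, the 525.125.06 driver reported later in this thread passes, while a 450-series driver would not.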
It works for me using a GeForce GTX 1650 on 525.125.06. Thank you.
I could use the GPUs with the v1.2.0-gpu model. However, it always chose GPU number 0, irrespective of what I tried to specify with --gpus. I tried all, 7, all with -e "NVIDIA_VISIBLE_DEVICES=7", but it was always GPU 0, not 7.
Did you get this to work? Sounds like it could be an issue with Docker instead of the ASR container.
Yes, it turns out you have to write it in the format --gpus \"device=5\". It would be great if the example on the main page included this, because it is not as trivial as --gpus all.
FYI for Docker Compose users: just add runtime: nvidia to the service in your compose file to let it use GPUs.
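A minimal compose sketch of that tip; the image tag, model, and port are assumptions carried over from the docker run example earlier in the thread:

```yaml
services:
  whisper-asr:
    image: onerahmet/openai-whisper-asr-webservice:latest-gpu
    runtime: nvidia
    environment:
      - ASR_MODEL=base
    ports:
      - "9000:9000"
```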
This didn't fix the problem on my machine. Running a P2000, Nvidia Driver Version: 550.127.05, CUDA Version: 12.4