Joseph Liba

Results: 10 comments by Joseph Liba

Yes! I can send my data and test case later today.

> On Fri, Jan 19, 2024 at 1:15 AM Purfview wrote: Does this have any actual impact on...

> Does this have any actual impact on performance? Do you have benchmarks?

Testing code:

```python
from faster_whisper import WhisperModel, decode_audio
from io import BytesIO
import time
from fastapi import FastAPI, ...
```

Nice! That's not a bad idea. Please don't merge this in for now. I noticed that there's a memory inefficiency, and the pool size needs to be capped or have a...
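
For context, one common way to cap such a pool is a bounded queue that hands out pre-created instances; a minimal standard-library sketch (the `ModelPool` class and all names here are illustrative, not from the PR):

```python
import queue

class ModelPool:
    """Bounded pool: memory cannot grow past max_size instances."""

    def __init__(self, factory, max_size):
        # Pre-create at most max_size models; the queue caps the pool.
        self._pool = queue.Queue(maxsize=max_size)
        for _ in range(max_size):
            self._pool.put(factory())

    def acquire(self, timeout=None):
        # Blocks until a model is free instead of allocating a new one.
        return self._pool.get(timeout=timeout)

    def release(self, model):
        self._pool.put(model)

# Usage with a stand-in "model" (a real pool would hold WhisperModel instances):
pool = ModelPool(factory=lambda: object(), max_size=2)
m1 = pool.acquire()
m2 = pool.acquire()
# A third acquire would block here until release() is called.
pool.release(m1)
m3 = pool.acquire()  # reuses m1 rather than creating a new model
```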

Yes, I am. Perhaps I can provide screenshots and code examples when I get back home.

> On Sat, Dec 30, 2023 at 11:44 AM Purfview wrote: Are you...

Code:

```python
from faster_whisper import WhisperModel, decode_audio
from io import BytesIO
import time
from fastapi import FastAPI, Request, UploadFile
import nvtx
import threading
from concurrent.futures import ThreadPoolExecutor
...
```
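
The truncated snippet appears to time concurrent transcription requests; the same measurement pattern, sketched with only the standard library and a stand-in task in place of an actual `WhisperModel` call:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def transcribe_stub(audio_id):
    # Stand-in for model.transcribe(); sleeps to simulate GPU work.
    time.sleep(0.05)
    return f"segment-{audio_id}"

# Time 8 concurrent "requests" across 4 worker threads.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(transcribe_stub, range(8)))
elapsed = time.perf_counter() - start
print(f"{len(results)} requests in {elapsed:.2f}s")
```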

Do you have a sense of why the transcriptions are not really using Tensor Cores but only CUDA cores? Is there any way to improve utilization here?

Thanks! Do you have a sense of what the bottlenecks in the GPU computation are? Would it be the number of cores in the CPU? I speculate that's probably not...

Will removing torch remove the supposed FFT speedup?

Great to hear! I work with segments about 10 seconds long, so there's no benefit from batching. However, I am curious and possibly interested in bumping up to the latest commit...
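
For reference, cutting audio into fixed ~10 s segments like those mentioned above is just windowing the sample array; a minimal sketch, assuming 16 kHz mono samples (`split_segments` is a hypothetical helper, not a faster-whisper API):

```python
def split_segments(samples, sample_rate=16000, seconds=10):
    # Slice the raw sample sequence into fixed-length windows;
    # the final chunk may be shorter than `seconds`.
    step = sample_rate * seconds
    return [samples[i:i + step] for i in range(0, len(samples), step)]

samples = list(range(16000 * 25))        # 25 s of fake audio
chunks = split_segments(samples)
print([len(c) / 16000 for c in chunks])  # durations in seconds: 10, 10, 5
```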

Would batching be able to support multiple audio files, such as multiple user requests coming through Triton?
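
One way a server could form such batches is a micro-batcher that drains a request queue up to a size or time limit before a single model call; an illustrative standard-library sketch (not Triton's actual batching API):

```python
import queue
import time

def drain_batch(q, max_batch=8, max_wait=0.05):
    """Collect up to max_batch items, waiting at most max_wait after the first."""
    batch = [q.get()]  # block until at least one request arrives
    deadline = time.monotonic() + max_wait
    while len(batch) < max_batch:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break
    return batch

# Three "user requests" queued up become one batch for the model.
q = queue.Queue()
for i in range(3):
    q.put(f"audio-{i}")
batch = drain_batch(q)
print(batch)
```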