diart
A python package to build AI-powered real-time audio applications
`torchaudio.Resample` appears to be slow compared to other libraries that implement resampling in Python (see https://github.com/jonashaag/audio-resampling-in-python). Switching to `soxr` could give a 10x speed increase.
In `RealTimeInference`, resample before `rearrange_audio_stream` so the same audio is not resampled multiple times. Because of how the first 5s buffer is filled at the beginning, this actually means that...
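To illustrate why the order matters, here is a rough sketch that counts how many samples reach the resampler in each case. The helper names are hypothetical and are not part of diart. The 44.1 kHz source rate and the 0.5 s shift between windows are assumptions made for the example; only the 5 s buffer comes from the issue.

```python
# Illustrative sketch only: these helpers are hypothetical and not part of diart.

def sliding_windows(num_samples, window, step):
    """Yield (start, end) sample indices of overlapping windows over a stream."""
    start = 0
    while start + window <= num_samples:
        yield start, start + window
        start += step


def resampled_after_windowing(num_samples, window, step):
    """Samples passed to the resampler when every window is resampled separately."""
    return sum(end - start for start, end in sliding_windows(num_samples, window, step))


def resampled_before_windowing(num_samples):
    """Samples passed to the resampler when the stream is resampled once up front."""
    return num_samples


SAMPLE_RATE = 44_100          # assumed source sample rate
WINDOW = 5 * SAMPLE_RATE      # 5 s buffer, as mentioned in the issue
STEP = SAMPLE_RATE // 2       # assumed 0.5 s shift between windows
STREAM = 60 * SAMPLE_RATE     # one minute of audio

after = resampled_after_windowing(STREAM, WINDOW, STEP)
before = resampled_before_windowing(STREAM)
print(f"resample per window: {after} samples")
print(f"resample once:       {before} samples")
print(f"redundancy factor:   {after / before:.2f}x")
```

Under these assumptions, overlapping windows cause most samples to be resampled several times, which is the redundancy the issue proposes to avoid by resampling first.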
When I run `diart.stream speakers:9`, or execute it in a Python script, it simply prints a notice about sox_io and then exits. No errors. How do I figure out what's...
Hello all, thank you for doing this great work! I just updated this code to use faster-whisper, and I am facing a repeated-words issue when I use the `initial_prompt` param in...
## Problem The number of parallel pipelines that can run in `Benchmark` is limited because the models need to be copied in each process. ## Idea Serve models in a...
## Problem The implementation of the CLI is a bit messy and mixed with the python API. ## Idea Use [jsonargparse](https://jsonargparse.readthedocs.io/en/stable/) to group `diart.stream`, `diart.tune` and `diart.benchmark` into a single...
I have a working application with a real-time transcription feature based on **faster-whisper**. However, after applying the **diart** pipeline to my existing application, I get a transcription with no diarization. I expect the...
Hi @juanmc2005, this PR fixes the error in which torch complains about a numpy subclass issue. It simply detaches the torch tensor before calling `numpy()`. Please review and let me...
Updated the README embedding-extraction pipeline example with the new sample rate of 16000. Also updated the print to display the embeddings and added the `hf_token` parameter. Let me know if you only want...
Hi, I am trying to run a pipeline to extract embeddings. The pipeline I am running is the one in the README: ``` import rx.operators as ops import diart.operators as...