diart

A Python package to build AI-powered real-time audio applications

Results 85 diart issues

Hi, thanks for your repository! I trained a model with a specific voice using the pyannote package, and now I want to use your streaming approach in my task. How...

I have downloaded these two models from HF: 1. https://huggingface.co/pyannote/segmentation-3.0/blob/main/pytorch_model.bin (segmentation model) 2. https://huggingface.co/pyannote/wespeaker-voxceleb-resnet34-LM/blob/main/pytorch_model.bin (embedding model) How do I load these models from files? Trying `segmentation = models.SegmentationModel.from_pretrained("PyAnnoteDiarization/pyannote_model_segmentation-3.0.bin", use_hf_token=False) `...

First of all, thanks for the project! I’m using it in a live Speech-to-Text (STT) + diarization setup: https://github.com/QuentinFuxa/whisper_streaming_web I am testing my pipeline using macOS BlackHole, which routes the...

bug
unclear

I have a custom streaming pipeline with a VAD setup that triggers ASR processing only when speech is detected on a small chunk. The pipeline operates in a streaming fashion,...

question

Is there any particular reason the rx library is used? Why do we need asynchronous code in this repo? What if we did not use rx at all, and the code...

question

Running diart.stream on macOS Sonoma (Apple Silicon) crashes with a RuntimeError: torchaudio_sox::get_info(), apparently because a pathlib.PosixPath object is passed to torchaudio.info, which now expects a str. The same call chain...

bug
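
The report above says a `pathlib.PosixPath` reaches `torchaudio.info`, which expects a `str`. A minimal sketch of one possible workaround, assuming the caller controls the path before it is handed to torchaudio: normalize it to a plain string first. The helper name `to_str_path` is hypothetical and is not part of diart or torchaudio; this is not necessarily how the project resolves the issue.

```python
import os
from pathlib import Path
from typing import Union

# Hypothetical helper (not part of diart or torchaudio): accept either a
# str or a pathlib.Path and always hand back a plain str.
def to_str_path(path: Union[str, Path]) -> str:
    """Return the file-system path as a plain str."""
    return os.fspath(path)

# Assumed usage, following the call named in the report:
#   info = torchaudio.info(to_str_path(audio_path))
```

`os.fspath` leaves a `str` unchanged and converts a `Path` to its string form, so existing callers that already pass strings would not be affected.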

Hello, how are you? Thanks for contributing to this project. I am trying to add the ReDimNet model (https://github.com/IDRnD/ReDimNet) as the embedding model of the pipeline. But the ReDimNet model does not require...

### Overview Implements a WebSocket server that can handle audio streams from multiple client connections ### Changes - Added multi-client support to WebSocket server - Created `StreamingInferenceHandler` for managing connections...

feature

Based on this code here: https://github.com/juanmc2005/diart/blob/392d53a1b0cd67701ecc20b683bb10614df2f7fc/src/diart/blocks/diarization.py#L50 it seems that attributes like duration, etc. are initialized with an "_" before their name. This raised an issue here: https://github.com/juanmc2005/diart/blob/392d53a1b0cd67701ecc20b683bb10614df2f7fc/src/diart/optim.py#L111 SpeakerDiarizationConfig class...

bug

Hi Developers, Thank you for your amazing work on this project! I was wondering if there’s a way to use Silero VAD. I noticed that PyAnnote VAD is supported, but...

feature
question