diart issues

Streaming voice activity detection with own model

Hi, thanks for your repository! I trained a model with a specific voice using the pyannote package, and now I want to use your streaming approach in my task. How...

m15kh

Question: How to load models from file?

3

I have downloaded these two models from HF: 1. https://huggingface.co/pyannote/segmentation-3.0/blob/main/pytorch_model.bin (segmentation model) 2. https://huggingface.co/pyannote/wespeaker-voxceleb-resnet34-LM/blob/main/pytorch_model.bin (embedding model) How do I load these models from files? Trying `segmentation = models.SegmentationModel.from_pretrained("PyAnnoteDiarization/pyannote_model_segmentation-3.0.bin", use_hf_token=False)`...

Kavi-Gupta

Periodic Silent Speaker Detection

1

First of all, thanks for the project! I’m using it in a live Speech-to-Text (STT) + diarization setup: https://github.com/QuentinFuxa/whisper_streaming_web I am testing my pipeline using macOS BlackHole, which routes the...

QuentinFuxa

bug

unclear

Real Time Diarization for Streaming Audio Chunks in Custom ASR Pipeline

4

I have a custom streaming pipeline with a VAD setup that triggers ASR processing only when speech is detected on a small chunk. The pipeline operates in a streaming fashion,...

sprath9

question

Question about rx

2

Is there any particular reason the rx library is used? Why do we need asynchronous code in this repo? What if we did not use rx at all, and the code...

nikifori

question

RuntimeError: torchaudio_sox::get_info() PosixPath type mismatch on macOS (diart 0.9.2 / torch 2.4.1 / pyannote.audio 3.3.0)

1

Running diart.stream on macOS Sonoma (Apple Silicon) crashes with a RuntimeError: torchaudio_sox::get_info(), apparently because a pathlib.PosixPath object is passed to torchaudio.info, which now expects a str. The same call chain...

ulvi0

bug
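A minimal sketch of one possible workaround for the report above, assuming the crash really does come from a `pathlib.PosixPath` reaching a function that only accepts `str`. The helper name `call_with_str_path` is hypothetical and is not part of diart or torchaudio; it only converts the path argument with the standard-library `os.fspath` before forwarding the call.

```python
import os
from pathlib import Path
from typing import Any, Callable, Union

PathLike = Union[str, "os.PathLike[str]"]


def call_with_str_path(fn: Callable[..., Any], path: PathLike, *args: Any, **kwargs: Any) -> Any:
    """Call fn with the path converted to a plain str (hypothetical helper).

    os.fspath returns a str unchanged and converts pathlib.Path objects
    to their string form, so the callee never sees a PosixPath.
    """
    return fn(os.fspath(path), *args, **kwargs)


if __name__ == "__main__":
    # Stand-in for a str-only function; in the reported setup one would
    # pass torchaudio.info here instead (assumption, not tested against it).
    def describe(p: str) -> str:
        if not isinstance(p, str):
            raise TypeError("expected str")
        return "got " + p

    print(call_with_str_path(describe, Path("audio.wav")))
```

Whether this is the right place to patch depends on where the `PosixPath` is created in the call chain, which the truncated report does not show.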

Add ReDimNet embedding model

2

Hello, how are you? Thanks for contributing to this project. I am trying to add the ReDimNet model (https://github.com/IDRnD/ReDimNet) as the embedding model of the pipeline. But the ReDimNet model does not require...

rose-jinyang

feat: Add WebSocket server with multi-client support

11

### Overview Implements a WebSocket server that can handle audio streams from multiple client connections ### Changes - Added multi-client support to WebSocket server - Created `StreamingInferenceHandler` for managing connections...

janaab11

feature

Tuning parameters are initialized wrong.

1

Based on this code here: https://github.com/juanmc2005/diart/blob/392d53a1b0cd67701ecc20b683bb10614df2f7fc/src/diart/blocks/diarization.py#L50 it seems that attributes such as duration are initialized with an "_" before their name. This raised an issue here: https://github.com/juanmc2005/diart/blob/392d53a1b0cd67701ecc20b683bb10614df2f7fc/src/diart/optim.py#L111 SpeakerDiarizationConfig class...

nikifori

bug

Support for Silero VAD

6

Hi Developers, Thank you for your amazing work on this project! I was wondering if there’s a way to use Silero VAD. I noticed that PyAnnote VAD is supported, but...

tjainsuki

feature

question
