diart

A Python package to build AI-powered real-time audio applications

Results 85 diart issues

I've been working on tuning the pipeline for my application, which is a real-time conversational system. The best results so far are: 36.02% DER, 2.41% false alarm, 20.04% missed...

question

It looks like the pipeline quickly forgets previous speakers and assigns wrong tags to new ones, so a conversation of 4-5 people is inferred as a conversation of 2. I am...

duplicate
question

## Problem Setting up the project takes a bit too long with all the dependencies and the use of conda. ## Idea Create and publish Docker images with new diart...

ops

### Feature Description I propose adding a feature to the diart project that allows speaker embeddings to be persisted and reused across multiple conversations. I am willing...

feature

Hello @juanmc2005, I use the hbredin/wespeaker-voxceleb-resnet34-LM (ONNX) model to extract speaker embeddings in the diarization pipeline, but I found the latency is too high (1300 ms) when calculating per chunk with the default...

question

Is there any way to integrate [voicefixer](https://github.com/haoheliu/voicefixer_main) into the speaker diarization pipeline? The package takes a WAV file as input and gives an upsampled 44100 Hz WAV file as output, but that...

feature

I am trying to run your tutorial on [transcription coloring](https://betterprogramming.pub/color-your-captions-streamlining-live-transcriptions-with-diart-and-openais-whisper-6203350234ef), but I am getting the mentioned error. The library runs fine with "diart.stream microphone". Running on Windows 11 with Python...

question

As segmentation models are getting better, it might make sense to revisit the idea of stitching based on segmentation alone. That's what this (WIP) pipeline does. Also, that was an...

feature

### Problem `LazyModel` makes it rather complicated for someone to add their own model, especially when some changes need to be made to the input/output. The reason `LazyModel` exists is...

feature
API

With pyannote 3.1, we could do only 1 forward pass of the audio instead of `num_speakers` passes when extracting embeddings with weights. This is probably at least one of the causes...

feature