diart
A Python package to build AI-powered real-time audio applications
I'm trying to find the data to run the benchmarks, but I can't find all the source data:
- [x] **VoxConverse** - found it!
- [ ] **AMI** - found the...
- [ ] Getting started (audio sources, pipelines, inference)
- [ ] Speaker Diarization
- [ ] Voice Activity Detection
- [ ] Benchmark
- [ ] Hyper-parameter tuning
-...
### Problem
It's getting more and more difficult to tune and evaluate diarization pipelines with different models or combinations of models, even with a GPU.
### Idea
Implement a caching...
### Problem
Configuring a pipeline and tracking changes is hard given the large number of arguments. This also leads to duplicated code in the CLI scripts.
### Idea
Load configurations...
Hey! Sorry for the long delay. I just started a new job and things have been hectic. I'm orienting myself to the package, and I can't seem to find any tests? JOSS...
`step` controls the minimum _algorithmic latency_ of the speaker diarization pipeline. Targeting real-time processing, one needs to make sure that the _processing latency_ (i.e. the time it takes to process...
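The relationship between `step` and processing latency can be checked empirically. The following is a minimal sketch, assuming the goal is for each chunk to be processed in less than `step` seconds. `process_chunk`, `measure_latencies` and `keeps_up` are hypothetical names used for illustration only; they are not part of diart's API.

```python
import time
from typing import Callable, Iterable, List


def measure_latencies(process_chunk: Callable[[object], object],
                      chunks: Iterable[object]) -> List[float]:
    """Return the wall-clock time (in seconds) spent processing each chunk.

    `process_chunk` is a hypothetical stand-in for one pipeline call.
    """
    latencies = []
    for chunk in chunks:
        start = time.perf_counter()
        process_chunk(chunk)
        latencies.append(time.perf_counter() - start)
    return latencies


def keeps_up(latencies: List[float], step: float) -> bool:
    """True if every chunk was processed in less than `step` seconds."""
    return all(latency < step for latency in latencies)
```

For example, `keeps_up(measure_latencies(my_fn, my_chunks), step=0.5)` would report whether a (hypothetical) `my_fn` stays within a 500 ms step on the given chunks. Checking every chunk rather than the mean is a deliberately strict choice: a single slow chunk is enough to fall behind a live stream.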
**Depends on #144** This PR adds a new `SpeakerAwareTranscription` pipeline that combines streaming diarization and streaming transcription to determine "who says what" in a live conversation. By default, this is...
**Depends on #143** Adding a streaming ASR pipeline needed a big refactoring (that began with #143). This PR continues this effort to allow a new type of pipeline that transcribes...
Hello, I've been trying to get your colored text demo working but nothing seems to happen. I've gotten the basic demo working from this repo and it works fine, but...