diart
Minimizing missed detection
I've been working on tuning the pipeline for my application, which is a real-time conversational system. The best results so far are:
36.02% DER, 2.41% false alarm, 20.04% missed detection, and 13.56% confusion
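As a quick sanity check on those numbers, here is a minimal sketch that assumes DER is the plain sum of the three components. It gives 36.01%, which is within rounding of the reported 36.02%, and shows that missed detection accounts for more than half of the total error. The variable names are only for illustration.

```python
# Error components reported above, in percent.
false_alarm = 2.41
missed_detection = 20.04
confusion = 13.56

# Assumption: DER is the sum of the three error components.
der = false_alarm + missed_detection + confusion

# Share of the total error attributable to each component.
shares = {
    "false alarm": false_alarm / der,
    "missed detection": missed_detection / der,
    "confusion": confusion / der,
}

print(f"DER from components: {der:.2f}%")
for name, share in shares.items():
    print(f"{name}: {share:.1%} of total error")
```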
I used my own data, which was recorded on the target platform's microphone array and annotated with WebRTC VAD. Our system is reasonably tolerant of false alarms, since we require ASR to label a speaker, but not of missed detections. Is there a way to make the pipeline less conservative?
I am using segmentation-3.0, the default embedding, and hyperparameters obtained through tuning: tau=0.493, rho=0.055, delta=0.633. I tried lowering tau and increasing rho, but that just increased the confusion rate. I also tried changing gamma and beta, as I understood that lowering them can yield the desired results, but they seem to have a negligible effect.
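To make clear what I mean by "less conservative", here is a toy sketch, assuming tau acts as a threshold on per-frame speaker activity scores. The scores, the lowered value of 0.3, and the `active_frames` helper are all made up for illustration; this is not diart's actual code.

```python
# Hypothetical per-frame activity scores in [0, 1] for one speaker.
scores = [0.10, 0.35, 0.48, 0.52, 0.70, 0.45, 0.25, 0.05]


def active_frames(scores, tau):
    """Return a 0/1 decision per frame: active if the score is at least tau."""
    return [1 if s >= tau else 0 for s in scores]


tuned = active_frames(scores, tau=0.493)   # tuned value from this post
lowered = active_frames(scores, tau=0.3)   # illustrative lower threshold

print(sum(tuned), "active frames at tau=0.493")
print(sum(lowered), "active frames at tau=0.3")
```

With a lower threshold, more borderline frames count as speech, which is the trade I would like to make: fewer missed detections at the cost of more false alarms.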