diart: The latency of the wespeaker model is too large

Open SheenChi opened this issue 1 year ago • 1 comment

Hello @juanmc2005, I use the hbredin/wespeaker-voxceleb-resnet34-LM (ONNX) model to extract speaker embeddings in the diarization pipeline, but the latency is too large: about 1300 ms per chunk with the default parameters (chunk=5s, step=0.5s, latency=0.5). This cannot meet the real-time requirement. I saw that you reported a latency of 48 ms on CPU and 15 ms on GPU. Is there anything I need to pay attention to when reproducing your performance? Thank you very much for any suggestions.
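
For anyone trying to reproduce the numbers, here is a rough sketch of how per-chunk latency could be timed and compared against the 0.5 s step. It is not the script behind the figures above. `dummy_embedding` is a hypothetical stand-in for the real embedding call, and the 16 kHz sample rate is an assumption. The warm-up runs are there because the first few calls to a model often include one-off initialization cost.

```python
import statistics
import time

SAMPLE_RATE = 16000   # assumed sample rate in Hz (not stated in the issue)
CHUNK_SECONDS = 5.0   # chunk duration from the issue
STEP_SECONDS = 0.5    # step duration from the issue


def dummy_embedding(chunk):
    """Hypothetical stand-in for the real embedding model call."""
    return sum(chunk) / len(chunk)


def measure_latency_ms(fn, chunk, warmup=3, runs=20):
    """Return the median wall-clock time of fn(chunk) in milliseconds."""
    # Warm-up calls are excluded from the timings.
    for _ in range(warmup):
        fn(chunk)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(chunk)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def meets_realtime(latency_ms, step_seconds=STEP_SECONDS):
    """A new chunk arrives every step, so processing must finish within one step."""
    return latency_ms < step_seconds * 1000.0


# One 5 s chunk of silence, used only as timing input.
chunk = [0.0] * int(SAMPLE_RATE * CHUNK_SECONDS)
latency = measure_latency_ms(dummy_embedding, chunk)
print(f"median latency: {latency:.2f} ms, real-time: {meets_realtime(latency)}")
```

With a 0.5 s step, the budget per chunk is 500 ms, so the reported 48 ms and 15 ms fit comfortably while 1300 ms does not. Replacing `dummy_embedding` with the actual model call would give a figure comparable to the ones quoted here.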

Dec 21 '23 02:12 SheenChi
