wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Results: 53 wespeaker issues

When wespeaker is run with torch>=2.1, it outputs this error: " > [ WARNING : 2024-07-20 17:11:39,248 ] - error to parse id07100/uUtjsdtDOkQ/00327.wav.wav > [ WARNING : 2024-07-20...

We will make the following updates to wespeaker in the coming weeks:
- [ ] Support SSL pretrained frontends such as WavLM
- [ ] Support architectures that accept raw waves...

Hi, first of all, thank you for your contribution. Today I used your example to retrain voxceleb/Resnet34. The dataset is the default vox1 and vox2 (downloaded via your default script utils)...

Will there be support for processing multiple files on the GPU at a time?

[rank0]: forward(__torch__.torch.nn.modules.container.___torch_mangle_16.Sequential self, Tensor input) -> Tensor: [rank0]: Expected a value of type 'Tensor (inferred)' for argument 'input' but instead found type 'Optional[Tensor]'. [rank0]: Inferred 'input' to be of type...

Hi, I noticed that a hamming window is used instead of the default povey window in the ONNX inference demo: https://github.com/wenet-e2e/wespeaker/blob/master/wespeaker/bin/infer_onnx.py#L47 . May I know the reason for using this? Are all models trained...

Hi, I am trying to reproduce the redimnetB2 results. I trained the model and got the following on Vox-O: 0.7% (no LM, no ASNORM), 0.61% (no LM, with ASNORM), 0.56 (no LM,...

Hello, I tested this code on several datasets and found a very high missed-speech (MS) rate, which leads to a high DER. Which part of the code needs to be modified? ![Image](https://github.com/user-attachments/assets/a2980ac3-68bf-4cc0-a89e-bf3cad78d244) ![Image](https://github.com/user-attachments/assets/671a0520-4117-4920-8069-b620fec6e3af)

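While using wespeaker, I found that it often fails to separate the speakers. For example, the attached recording is a conversation between a man and a woman whose voices sound quite different, yet the test result is as shown below. So my question is: is there any parameter, such as similarity, that can improve accuracy? I looked carefully at the Speaker class but found nothing: ``` ('unk', 0.1, 1.9, 0) ('unk', 2.0, 4.1, 0) ('unk', 4.7, 5.7, 0) ('unk', 29.8, 30.3, 0) ('unk', 32.5, 33.2, 0) ('unk', 33.5, 36.1, 0) ('unk', 36.5, 38.9, 0)...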

I'm trying to train DINO SSL on my own dataset (1.2M samples), and the training process is very slow even though my dataset is stored as shard files. This...