AjianIronSide
This seems to be the same problem when using the multi_cn model and yaml.
The worker log is like:
2020-03-11 07:26:14 - INFO: decoder2: 67a2f518-a6ca-425b-8626-325027fc1599: Initializing request
2020-03-11 07:26:14 - INFO: decoder2: 67a2f518-a6ca-425b-8626-325027fc1599: Setting caps to audio/x-raw, layout=(string)interleaved, rate=(int)16000, format=(string)S16LE, channels=(int)1
2020-03-11 07:26:14 -...
Please try to use 'encoder': torch.load('../pretrained_models/labelencoders/teacher.pth') instead of 'encoder': torch.load('../labelencoders/vad.pth'). Actually, I am not sure how the label encoders are trained or generated. Could you please simply explain how...
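For anyone trying the swap suggested above, here is a minimal sketch of loading an object from a .pth file into a dict under the 'encoder' key. It does not reproduce the real teacher.pth: the temporary file, the `resources` dict name, and the class names are hypothetical stand-ins, and the actual label encoder may be a different kind of object.

```python
import os
import tempfile

import torch

# Hypothetical stand-in for a label encoder file; the real
# ../pretrained_models/labelencoders/teacher.pth is not reproduced here.
tmp_dir = tempfile.mkdtemp()
encoder_path = os.path.join(tmp_dir, "teacher.pth")
torch.save({"classes": ["Speech", "Noise"]}, encoder_path)

# Hypothetical dict; only the 'encoder' key and the torch.load call
# come from the comment above.
resources = {
    "encoder": torch.load(encoder_path),
}

# Inspect what was loaded before passing it on.
print(type(resources["encoder"]))
print(resources["encoder"]["classes"])
```

Printing the type of the loaded object is a quick way to check what a given .pth file actually contains before relying on it.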
Yes, the fine-tuning models against the student/teacher model you provided. Your model is so good at rejecting noise. If the speech has complicated background noise, it is very likely to...
Yeah, I tried. Sadly, the results were not good after tuning.