Lucky Wong
A modified pitch correlation inspired by Section 3.1 (Features and Quantization: Pitch) of the paper "A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet". Link: https://jmvalin.ca/papers/lpcnet_codec.pdf
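For context, a plain normalized pitch correlation can be sketched as below. This is only an illustration of the general idea (correlating a frame with a copy of itself delayed by a candidate pitch lag), not the modified version from this change and not the exact procedure in the paper. The names `pitch_correlation` and `best_lag` are hypothetical.

```python
import math


def pitch_correlation(x, lag):
    """Normalized correlation between x[n] and x[n - lag] (hypothetical helper)."""
    if lag <= 0 or lag >= len(x):
        raise ValueError("lag must satisfy 0 < lag < len(x)")
    idx = range(lag, len(x))
    xy = sum(x[n] * x[n - lag] for n in idx)   # cross term
    xx = sum(x[n] * x[n] for n in idx)         # energy of the current segment
    yy = sum(x[n - lag] * x[n - lag] for n in idx)  # energy of the delayed segment
    denom = math.sqrt(xx * yy)
    # Return 0 for silent input instead of dividing by zero.
    return xy / denom if denom > 0.0 else 0.0


def best_lag(x, min_lag, max_lag):
    """Pick the lag in [min_lag, max_lag] with the highest correlation."""
    return max(range(min_lag, max_lag + 1), key=lambda t: pitch_correlation(x, t))


# Usage: a sine wave with a period of 40 samples.
signal = [math.sin(2.0 * math.pi * n / 40.0) for n in range(400)]
lag = best_lag(signal, 20, 60)
```

A practical pitch search usually works on a downsampled or filtered signal and guards against octave errors; those details are omitted here.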
Training command is:

```shell
./pruned_transducer_stateless7/train.py \
  --world-size 8 \
  --num-epochs 90 \
  --use-fp16 1 \
  --max-duration 200 \
  --exp-dir pruned_transducer_stateless7/exp \
  --feedforward-dims "1024,1024,2048,2048,1024" \
  --master-port 12535
```

The tensorboard log...
I noticed that [lhotse-speech](https://github.com/lhotse-speech/lhotse/blob/5ec9baaaac43666c8c6985fe02b459da7ddc1428/lhotse/augmentation/wpe.py#L66) uses a PyTorch version of `nara_wpe`. However, `torch.linalg.solve` raises an error for some audio, so I replaced `torch.linalg.solve` with `_stable_solve`.
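The idea of a more tolerant solve can be sketched as follows. This is a minimal NumPy illustration, assuming the failures come from singular matrices in a batch; it is not the `_stable_solve` from `nara_wpe` nor the code in this change, and the name `stable_solve_sketch` is hypothetical. It tries a batched solve first and falls back to a per-matrix least-squares solution only where the direct solve fails.

```python
import numpy as np


def stable_solve_sketch(A, B):
    """Solve A @ X = B for batched A (..., M, M) and B (..., M, K).

    Hypothetical helper: falls back to least squares for singular matrices.
    """
    try:
        # Fast path: every matrix in the batch is invertible.
        return np.linalg.solve(A, B)
    except np.linalg.LinAlgError:
        batch_shape = A.shape[:-2]
        A2 = A.reshape(-1, *A.shape[-2:])
        B2 = B.reshape(-1, *B.shape[-2:])
        dtype = np.result_type(A, B, np.float64)
        out = np.empty((A2.shape[0], A.shape[-1], B.shape[-1]), dtype=dtype)
        for i in range(A2.shape[0]):
            try:
                out[i] = np.linalg.solve(A2[i], B2[i])
            except np.linalg.LinAlgError:
                # Minimum-norm least-squares solution for the singular case.
                out[i] = np.linalg.lstsq(A2[i], B2[i], rcond=None)[0]
        return out.reshape(*batch_shape, A.shape[-1], B.shape[-1])


# Usage: the second matrix is singular, so a plain batched solve would raise.
A = np.array([[[2.0, 0.0], [0.0, 2.0]],
              [[1.0, 2.0], [2.0, 4.0]]])
B = np.array([[[2.0], [4.0]],
              [[1.0], [2.0]]])
X = stable_solve_sketch(A, B)
```

The per-matrix loop is slower than the batched call, so it only runs when the fast path has already failed.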
I tested on blind_test_set_interspeech2021 with the enhanced audio forced to all-zero samples, and I found a bug in [MOS local Version 5](https://github.com/microsoft/AEC-Challenge/tree/main/AECMOS/AECMOS_local): all scores are very high, but...
Fix redefinition of unused function issue.