rnnoise
16kHz conversion
Thanks for the great project! I got a lot of inspiration from it.
Now I'm trying to convert the code from a 48 kHz sampling rate to a 16 kHz sampling rate.
I changed some parameters as follows.
- INPUT_FEATURE_LEN : 42 -> 38
- OUTPUT_GAIN_LEN : 22 -> 18

This was changed in accordance with the eband5ms table:

    static const opus_int16 eband5ms[] = {
    /*0 200 400 600 800 1k 1.2 1.4 1.6 2k 2.4 2.8 3.2 4k 4.8 5.6 6.8 8k 9.6 12k 15.6 20k*/
      0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, 28, 34, 40, 48, 60, 78, 100
    };

I also changed the pitch-related constants:

    #define PITCH_MIN_PERIOD 60 -> 20
    #define PITCH_MAX_PERIOD 768 -> 256
    #define PITCH_FRAME_SIZE 960 -> 320
Is this correct? I retrained the neural network after changing the above parameters, but it does not seem to work (the only effect is that the audio's gain is reduced).
Can anyone answer my question? Thanks.
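The band and feature counts in the post above can be checked with a short C sketch. It is only a sanity check under stated assumptions, not a confirmed recipe: it assumes the eband5ms entries are in units of 200 Hz (as the comment in the table suggests), that every band edge at or below the Nyquist frequency is kept, and that the feature count follows NB_BANDS + 3*NB_DELTA_CEPS + 2 with NB_DELTA_CEPS equal to 6. The helper names count_band_edges and feature_count are hypothetical.

```c
#include <stddef.h>

/* Band edges from the eband5ms table, assumed to be in units of 200 Hz. */
static const short eband5ms[] = {
    0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, 28, 34, 40, 48, 60, 78, 100
};

#define NB_DELTA_CEPS 6 /* assumption about the feature layout in denoise.c */

/* Hypothetical helper: number of band edges at or below the Nyquist frequency. */
static int count_band_edges(int sample_rate_hz)
{
    int nyquist_units = (sample_rate_hz / 2) / 200; /* Nyquist in 200 Hz units */
    int count = 0;
    size_t n = sizeof(eband5ms) / sizeof(eband5ms[0]);
    for (size_t i = 0; i < n; i++) {
        if (eband5ms[i] <= nyquist_units)
            count++;
    }
    return count;
}

/* Hypothetical helper: input feature count for a given number of bands (assumed formula). */
static int feature_count(int nb_bands)
{
    return nb_bands + 3 * NB_DELTA_CEPS + 2;
}
```

Under these assumptions, count_band_edges(16000) gives 18 and feature_count(18) gives 38, which matches the values in the post. The 48 kHz case gives 22 and 42.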
If you make any change to the way the features are computed (which is the case if you change the number of bands), then you need to retrain all the weights. If you just want to use the existing code (and weights) as is, the simplest would just be to resample from 16 kHz to 48 kHz and then back at the end.
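The resampling route could look roughly like the C sketch below. It is a minimal illustration, not production code: the upsampler is plain linear interpolation and the downsampler is a three-sample average, both of which are crude stand-ins for a proper band-limited resampler. The process48_fn callback type and the passthrough48 placeholder are hypothetical. In real use the callback would wrap the 48 kHz denoiser (for example a call to rnnoise_process_frame on a 480-sample frame).

```c
#include <stddef.h>
#include <string.h>

#define FRAME_48K 480             /* 10 ms at 48 kHz */
#define FRAME_16K (FRAME_48K / 3) /* the same 10 ms at 16 kHz */

/* Hypothetical callback standing in for the 48 kHz denoiser. */
typedef void (*process48_fn)(void *state, float *out, const float *in);

/* Placeholder processor that copies input to output (no denoising). */
static void passthrough48(void *state, float *out, const float *in)
{
    (void)state;
    memcpy(out, in, FRAME_48K * sizeof(float));
}

/* Crude 1:3 upsampling by linear interpolation.
 * *prev carries the last input sample across frame boundaries. */
static void upsample3(const float *in, float *out, float *prev)
{
    for (size_t i = 0; i < FRAME_16K; i++) {
        float a = *prev, b = in[i];
        out[3 * i]     = a + (b - a) * (1.0f / 3.0f);
        out[3 * i + 1] = a + (b - a) * (2.0f / 3.0f);
        out[3 * i + 2] = b;
        *prev = b;
    }
}

/* Crude 3:1 downsampling: average each group of three samples. */
static void downsample3(const float *in, float *out)
{
    for (size_t i = 0; i < FRAME_16K; i++)
        out[i] = (in[3 * i] + in[3 * i + 1] + in[3 * i + 2]) / 3.0f;
}

/* Process one 160-sample 16 kHz frame through a 48 kHz processor. */
static void denoise_frame_16k(process48_fn process, void *state, float *prev,
                              float *out16, const float *in16)
{
    float up[FRAME_48K], den[FRAME_48K];
    upsample3(in16, up, prev);
    process(state, den, up);
    downsample3(den, out16);
}
```

The caller keeps one float of state (prev, initialised to 0) per stream and feeds consecutive 160-sample frames. The existing 48 kHz weights stay untouched with this approach, at the cost of extra computation.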
Thanks for your quick answer! I am changing the features, so I need to retrain all the weights. I did that, but the result was not good (the only effect is that the audio's gain is reduced). My real question is: are the following changes correct when moving from a 48 kHz to a 16 kHz sampling rate? In the denoise.c file:

    #define PITCH_MIN_PERIOD 60 -> 20
    #define PITCH_MAX_PERIOD 768 -> 256
    #define PITCH_FRAME_SIZE 960 -> 320

I'm sorry to bother you. I'm not familiar with the pitch estimation code from Opus, so I wonder whether my approach is okay. Thanks.
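One way to sanity-check the scaled pitch constants is to convert them back to physical units. The sketch below assumes that a pitch period of N samples corresponds to a frequency of sample_rate / N and that PITCH_FRAME_SIZE is a plain sample count. The helper names are hypothetical.

```c
/* Frequency in Hz of a pitch period given in samples. */
static double period_to_hz(int period_samples, int sample_rate_hz)
{
    return (double)sample_rate_hz / period_samples;
}

/* Duration in milliseconds of a buffer given in samples. */
static double samples_to_ms(int samples, int sample_rate_hz)
{
    return 1000.0 * samples / sample_rate_hz;
}
```

Under that assumption, 60 and 768 samples at 48 kHz and 20 and 256 samples at 16 kHz both span 800 Hz down to 62.5 Hz, and 960 samples at 48 kHz and 320 samples at 16 kHz are both 20 ms. The three scaled values are therefore consistent with each other, though this alone does not show that nothing else in the feature code or training pipeline needs to change.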
Hello enting0608,
You said you are trying to convert the code from a 48 kHz sampling rate to a 16 kHz sampling rate, and that you changed some parameters but the result was not good. I am trying to do the same thing, and I have a few questions. I would really appreciate it if you could discuss them with me.
- Do I have to change FRAME_SIZE? The default FRAME_SIZE is 480. Do I have to change it to 160? Or, if I just want to use a 16 kHz file as input and keep the same delay, is there no need to change FRAME_SIZE?
- Have you completed this sampling-rate conversion? If so, which parameters exactly should I change?

Thank you!
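The effect of FRAME_SIZE on delay and frequency resolution can be worked out with a small C sketch. It assumes the analysis window is twice the frame size (WINDOW_SIZE = 2 * FRAME_SIZE), which is how I read denoise.c. The helper names are hypothetical.

```c
/* Frame duration in milliseconds. */
static double frame_ms(int frame_size, int sample_rate_hz)
{
    return 1000.0 * frame_size / sample_rate_hz;
}

/* FFT bin spacing in Hz, assuming a window of 2 * frame_size samples. */
static double bin_spacing_hz(int frame_size, int sample_rate_hz)
{
    return (double)sample_rate_hz / (2 * frame_size);
}
```

With these definitions, 480 samples at 48 kHz and 160 samples at 16 kHz are both 10 ms frames with 50 Hz bins. Keeping 480 samples at 16 kHz would instead give a 30 ms frame with bins about 16.7 Hz apart, which would shift the band edges if the band table is indexed in bins.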
@enting0608 @nerv3890 Were you able to train RNNoise at 16 kHz and attain the same quality as at 48 kHz? What changes are required?
@jmvalin What changes would be needed in the code to use it with 16 kHz input? (I don't want to resample from 16 kHz to 48 kHz.) I am asking about the code changes for both training and inference at 16 kHz.
There seems to be some work done by Gregor in this commit: https://github.com/GregorR/rnnoise-nu/commit/53f34de7d95af80c0c9101c791db47a05ec36196 in github.com/GregorR/rnnoise-nu
I look forward to something ready-to-use at 8kHz for denoising human speech.
Any update on how to fine-tune the hyperparameters for 8 or 16 kHz?