rnnoise
Understanding the per-frequency gain applied per band
Hello,
I'm having a hard time understanding the function interp_band_gain.
According to the paper, the gain applied to the FFT at each frequency bin is the sum of the amplitudes of all the bands to which that frequency belongs.
In code it looks like:
  if (!silence) {
    compute_rnn(&st->rnn, g, &vad_prob, features);
    pitch_filter(X, P, Ex, Ep, Exp, g);
    for (i=0;i<NB_BANDS;i++) {
      float alpha = .6f;
      g[i] = MAX16(g[i], alpha*st->lastg[i]);
      st->lastg[i] = g[i];
    }
    interp_band_gain(gf, g);
#if 1
    for (i=0;i<FREQ_SIZE;i++) {
      X[i].r *= gf[i];
      X[i].i *= gf[i];
    }
#endif
  }
The code for interp_band_gain is:
void interp_band_gain(float *g, const float *bandE) {
  int i;
  memset(g, 0, FREQ_SIZE);
  for (i=0;i<NB_BANDS-1;i++)
  {
    int j;
    int band_size;
    band_size = (eband5ms[i+1]-eband5ms[i])<<FRAME_SIZE_SHIFT;
    for (j=0;j<band_size;j++) {
      float frac = (float)j/band_size;
      g[(eband5ms[i]<<FRAME_SIZE_SHIFT) + j] = (1-frac)*bandE[i] + frac*bandE[i+1];
    }
  }
}
To my knowledge, the Bark critical bands do not overlap. So how can any one frequency belong to more than one band?
Why would I not just do (pseudocode):
float band_gains[24];
for (j = 0; j < nfft; ++j) {
    float frequency = (float)j * sample_rate / nfft;  /* center frequency of bin j */
    if (band_0_left <= frequency && frequency < band_0_right)
        fft[j] *= band_gains[0];
    else if (band_1_left <= frequency && frequency < band_1_right)
        fft[j] *= band_gains[1];
    ...
}
Where do these magic values come from?
static const opus_int16 eband5ms[] = {
/*0  200 400 600 800  1k 1.2 1.4 1.6  2k 2.4 2.8 3.2  4k 4.8 5.6 6.8  8k 9.6 12k 15.6 20k*/
  0,  1,  2,  3,  4,  5,  6,  7,  8, 10, 12, 14, 16, 20, 24, 28, 34, 40, 48, 60, 78, 100
};
This table looks inherited from the Opus codebase. Is it some mapping from Bark band edge frequencies to DFT indices?
How can I create my own?
As far as I can tell from the paper, the band layout is inherited from the Opus codec and is only an approximation of the Bark scale.
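To make the table less magic, here is a minimal sketch of how such a table could be derived from edge frequencies in Hz. It assumes a 48 kHz sample rate, a 960-point window and FRAME_SIZE_SHIFT equal to 2, which is my recollection of denoise.c and not something stated in this thread. Under those assumptions each table unit is 200 Hz, the bin spacing of a 240-point (5 ms) transform, which is my reading of the "5ms" in the name. The helpers hz_to_bin, make_eband5ms and print_band_table are hypothetical names, and the edge list is read off the comment above eband5ms.

```c
#include <stdio.h>

#define SAMPLE_RATE      48000
#define WINDOW_SIZE      960   /* assumed FFT length: 20 ms at 48 kHz */
#define FRAME_SIZE_SHIFT 2     /* assumed: 20 ms window / 5 ms reference = 1<<2 */
#define NB_BANDS         22

/* Band edges in Hz, taken from the comment above eband5ms. */
static const int edge_hz[NB_BANDS] = {
  0, 200, 400, 600, 800, 1000, 1200, 1400, 1600, 2000, 2400,
  2800, 3200, 4000, 4800, 5600, 6800, 8000, 9600, 12000, 15600, 20000
};

/* Nearest DFT bin of an nfft-point transform for a frequency in Hz. */
static int hz_to_bin(int hz, int sample_rate, int nfft) {
  return (int)(((long long)hz * nfft + sample_rate / 2) / sample_rate);
}

/* Express each edge in 5 ms units, the form the eband5ms table uses. */
static void make_eband5ms(int *table) {
  int i;
  for (i = 0; i < NB_BANDS; i++)
    table[i] = hz_to_bin(edge_hz[i], SAMPLE_RATE, WINDOW_SIZE) >> FRAME_SIZE_SHIFT;
}

/* Print edge frequency, bin in the full-size FFT, and table entry. */
static void print_band_table(void) {
  int table[NB_BANDS];
  int i;
  make_eband5ms(table);
  for (i = 0; i < NB_BANDS; i++)
    printf("%5d Hz -> bin %3d -> table %3d\n",
           edge_hz[i], table[i] << FRAME_SIZE_SHIFT, table[i]);
}
```

To build a different layout under the same assumptions, replace edge_hz with your own edges. Edges that are not multiples of 200 Hz lose precision when shifted down to 5 ms units, so it may be simpler to store full-resolution bin indices and drop the shift.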
"Rather than rectangular bands, we use triangular bands, with the peak response being at the boundary between bands. "
So adjacent bands do overlap: a bin that lies between two band boundaries belongs partly to each of them.
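A small sketch may help show that the linear interpolation in interp_band_gain is the same thing as summing band gains weighted by triangular windows. It assumes FRAME_SIZE_SHIFT is 2 and FREQ_SIZE is 481 (values I recall from denoise.c, not confirmed in this thread). The helpers peak_bin, tri_weight, gain_by_sum, gain_by_interp and max_mismatch are hypothetical names written for this illustration and are not part of rnnoise.

```c
#define NB_BANDS         22
#define FRAME_SIZE_SHIFT 2     /* assumed, as recalled from denoise.c */
#define FREQ_SIZE        481   /* assumed: FRAME_SIZE + 1 with FRAME_SIZE = 480 */

static const int eband5ms[NB_BANDS] = {
  0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, 28, 34, 40, 48, 60, 78, 100
};

/* Bin where band b has its peak response (the boundary between bands). */
static int peak_bin(int b) { return eband5ms[b] << FRAME_SIZE_SHIFT; }

/* Triangular weight of band b at bin k: 1 at the band's own peak,
   falling linearly to 0 at the peaks of the two neighbouring bands. */
static float tri_weight(int b, int k) {
  int c = peak_bin(b);
  if (k == c) return 1.f;
  if (b > 0 && k > peak_bin(b-1) && k < c)
    return (float)(k - peak_bin(b-1)) / (float)(c - peak_bin(b-1));
  if (b < NB_BANDS-1 && k > c && k < peak_bin(b+1))
    return (float)(peak_bin(b+1) - k) / (float)(peak_bin(b+1) - c);
  return 0.f;
}

/* Per-bin gain as a weighted sum over all bands. */
static float gain_by_sum(const float *g, int k) {
  float s = 0.f;
  int b;
  for (b = 0; b < NB_BANDS; b++) s += tri_weight(b, k) * g[b];
  return s;
}

/* Per-bin gain by interpolating between neighbouring band gains,
   following the same loop structure as interp_band_gain. */
static void gain_by_interp(float *gf, const float *g) {
  int i, j;
  for (i = 0; i < FREQ_SIZE; i++) gf[i] = 0.f;
  for (i = 0; i < NB_BANDS-1; i++) {
    int band_size = peak_bin(i+1) - peak_bin(i);
    for (j = 0; j < band_size; j++) {
      float frac = (float)j / band_size;
      gf[peak_bin(i) + j] = (1-frac)*g[i] + frac*g[i+1];
    }
  }
}

/* Largest absolute difference between the two methods, over the bins
   the interpolation loop writes (everything below the last edge). */
static float max_mismatch(const float *g) {
  float gf[FREQ_SIZE], worst = 0.f;
  int k;
  gain_by_interp(gf, g);
  for (k = 0; k < peak_bin(NB_BANDS-1); k++) {
    float d = gain_by_sum(g, k) - gf[k];
    if (d < 0) d = -d;
    if (d > worst) worst = d;
  }
  return worst;
}
```

With these assumptions, bin 14 sits halfway between the peaks of bands 3 and 4 (bins 12 and 16), so tri_weight gives 0.5 for each of them and 0 for every other band. At most two bands contribute to any bin and their weights add up to 1, which is why the sum collapses to a two-point interpolation. The comparison stops before the last edge because the interpolation loop never writes that bin.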