
Comb filter M = 3 and PITCH_MAX_PERIOD 768: how to meet the 40 ms look-ahead requirement?

Open cookcodes opened this issue 4 years ago • 1 comments

The paper says: "To achieve 40 ms look-ahead including the 10-ms overlap, we use M = 3". A 40 ms look-ahead that includes the 10 ms overlap leaves only 30 ms of data for the comb filter to shift over in the time domain: 30 ms = 3 * 480 = 1440 samples. But with M = 3 and PITCH_MAX_PERIOD 768, the maximum shift is 3 * 768 = 2304 samples.

In the current code, FRAME_LOOKAHEAD is set to 5. Since 5 * 480 > 3 * 768, the shift fits, but this leads to a 60 ms look-ahead.

In the paper "Low-Complexity, Real-Time Joint Neural Echo Control and Speech Enhancement Based On PercepNet", M is changed to 2. But even with M = 2, 2 * 768 > 3 * 480.

So what are the correct values of M and PITCH_MAX_PERIOD to meet the 40 ms look-ahead requirement?
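For anyone checking the numbers, here is a minimal sketch of the arithmetic in this question. It assumes 480-sample frames are 10 ms each (so a 48 kHz sample rate) and that the total look-ahead is the look-ahead frames plus the 10 ms overlap. The function names are made up for illustration and are not from the PercepNet code.

```python
# Assumed constants: 480-sample frames at 48 kHz, i.e. 10 ms per frame.
SAMPLE_RATE_HZ = 48_000
FRAME_SIZE = 480
OVERLAP_MS = 10.0  # the "10-ms overlap" quoted from the paper


def total_lookahead_ms(lookahead_frames: int) -> float:
    """Look-ahead in ms: the look-ahead frames plus the 10 ms overlap."""
    frames_ms = 1000.0 * lookahead_frames * FRAME_SIZE / SAMPLE_RATE_HZ
    return frames_ms + OVERLAP_MS


def max_comb_shift(comb_m: int, pitch_max_period: int) -> int:
    """Largest time-domain shift of the comb filter, in samples."""
    return comb_m * pitch_max_period


def shift_fits(comb_m: int, pitch_max_period: int, lookahead_frames: int) -> bool:
    """True if the largest shift stays within the look-ahead frames."""
    budget = lookahead_frames * FRAME_SIZE
    return max_comb_shift(comb_m, pitch_max_period) <= budget


if __name__ == "__main__":
    # (M, look-ahead frames) pairs discussed in the question, with period 768.
    for comb_m, frames in [(3, 3), (3, 5), (2, 3)]:
        print(
            f"M={comb_m} frames={frames} "
            f"shift={max_comb_shift(comb_m, 768)} "
            f"budget={frames * FRAME_SIZE} "
            f"lookahead_ms={total_lookahead_ms(frames)} "
            f"fits={shift_fits(comb_m, 768, frames)}"
        )
```

Under these assumptions, only the 5-frame setting fits a period of 768, and it costs 60 ms of look-ahead.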

cookcodes, Jul 27 '21 09:07

The frame look-ahead is needed for an internal reason: we have to avoid overflow in the pitch filter buffer. I drew a sketch of the pitch buffer for the case where FRAME_LOOKAHEAD is 3 (the current FRAME_LOOKAHEAD is 5). The frame size is 480 and the window size is 960. Assume the max pitch period is 768, following the rapt module from RNNoise. We need 3 delayed STFT windows for comb filtering (COMB_M = 3), which are the yellow windows shown below. Since 768 * 3 + 960 - 960 > 480 * 3, an overflow occurs.

image
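The overflow condition can be written as a small check. This is only a sketch of the inequality as stated in this reply, not the buffer code from the repository. The function names are hypothetical, and the two window-size terms are kept as they appear in the inequality even though they cancel.

```python
FRAME_SIZE = 480   # samples per frame, as stated in the reply
WINDOW_SIZE = 960  # samples per STFT window, as stated in the reply


def pitch_buffer_overflows(pitch_max_period: int, comb_m: int,
                           frame_lookahead: int) -> bool:
    """Evaluate the reply's inequality:
    pitch_max_period * comb_m + window - window > frame_size * frame_lookahead
    """
    needed = pitch_max_period * comb_m + WINDOW_SIZE - WINDOW_SIZE
    available = FRAME_SIZE * frame_lookahead
    return needed > available


def min_frame_lookahead(pitch_max_period: int, comb_m: int) -> int:
    """Smallest whole number of look-ahead frames for which the
    inequality above no longer holds (integer ceiling division)."""
    needed = pitch_max_period * comb_m
    return -(-needed // FRAME_SIZE)


if __name__ == "__main__":
    print(pitch_buffer_overflows(768, 3, 3))  # the case in the sketch
    print(pitch_buffer_overflows(768, 3, 5))  # the current setting
    print(min_frame_lookahead(768, 3))
```

With a max pitch period of 768 and COMB_M = 3, `min_frame_lookahead` gives 5, which matches the current FRAME_LOOKAHEAD. Other combinations of period and COMB_M can be tried the same way.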

We can meet the 40 ms look-ahead if we change PITCH_MAX_PERIOD from 768 to 512 and FRAME_LOOKAHEAD from 5 to 3.

jzi040941, Jul 30 '21 04:07