openl3

Make linear frontend consistent with mel

Open auroracramer opened this issue 3 years ago • 0 comments

The linear frequency spectrogram frontend results in 197 frames instead of 199, which seems to be caused by:

  • In _librosa_linear_frontend (openl3/core.py), center=False should be center=True when calling librosa.stft
  • In _construct_linear_audio_network (openl3/models.py), pad_end=True should be added to the arguments passed to __fix_kapre_spec
  • In _construct_linear_audio_network (openl3/models.py), in the else block corresponding to if include_frontend, the input_shape should account for centering
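
The 197 vs. 199 discrepancy can be checked with the usual STFT frame-count arithmetic. The sketch below is only an illustration: the clip length (1 s at 48 kHz), `n_fft=512`, and `hop_length=242` are assumed values that are not stated in this issue, and `stft_num_frames` is a hypothetical helper, not an openl3 or librosa function. It assumes that centering pads `n_fft // 2` samples on each side before framing.

```python
def stft_num_frames(n_samples, n_fft, hop_length, center):
    """Number of STFT frames for a signal of n_samples samples.

    Hypothetical helper for illustration; assumes center=True pads
    n_fft // 2 samples on both sides before framing.
    """
    if center:
        n_samples += 2 * (n_fft // 2)
    return 1 + (n_samples - n_fft) // hop_length


# Assumed parameters (not taken from this issue).
N_SAMPLES = 48000   # 1 second at an assumed 48 kHz sample rate
N_FFT = 512         # assumed window size of the linear frontend
HOP_LENGTH = 242    # assumed hop size

frames_uncentered = stft_num_frames(N_SAMPLES, N_FFT, HOP_LENGTH, center=False)
frames_centered = stft_num_frames(N_SAMPLES, N_FFT, HOP_LENGTH, center=True)

print("center=False:", frames_uncentered)
print("center=True: ", frames_centered)
```

Under these assumptions, the uncentered count matches the 197 frames reported above and the centered count matches the expected 199, which is consistent with the missing centering being the cause.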

In addition to fixing these, we'll likely need to regenerate the regression data.

auroracramer, Aug 10 '21 15:08