Where can I find and modify the parameters of the Mel filter bank in the frequency analysis module?

Open Garywatersignal opened this issue 5 months ago • 4 comments

Hello, thank you for your great work on this project. I have a question regarding the audio preprocessing pipeline.

According to the literature and documentation, the input audio is first converted into a logarithmic Mel spectrogram, which is standard and expected. However, as I try to adapt the model for whale sound classification, I would like to adjust the parameters of the Mel filter bank (e.g., number of filters, frequency range) to better suit the characteristics of whale vocalizations.

While reviewing the code, I was unable to find any explicit definition or implementation of the Mel filter bank in the Python files. Instead, I found code that generates a log-frequency spectrogram, which appears to be used in place of a standard Mel spectrogram. Could you please clarify:

  1. Where in the codebase is the Mel filter bank (or equivalent frequency scaling) defined?

  2. Is the Mel scale used at all, or is it replaced by a different log-frequency scaling method?

  3. Which file or function should I look into if I want to modify the frequency resolution or scaling to better capture low-frequency whale calls?

For reference, I have been exploring birdnet_analyzer/utils.py for these functionalities. Any guidance or pointers would be greatly appreciated.
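
To make the parameters in question concrete, here is a minimal NumPy sketch of a generic triangular Mel filter bank and log-Mel spectrogram. It is not BirdNET-Analyzer code and says nothing about how the model actually computes its spectrograms: the function names (`mel_filter_bank`, `log_mel_spectrogram`), the HTK-style Mel formula, and the default values (64 filters, 10 Hz to 2000 Hz) are assumptions chosen purely for illustration.

```python
import numpy as np


def hz_to_mel(f):
    # HTK-style Mel scale (an assumption; other Mel variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filter_bank(sample_rate, n_fft, n_filters, fmin, fmax):
    """Return an (n_filters, n_fft // 2 + 1) matrix of triangular filters."""
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    # n_filters triangles need n_filters + 2 edge points, equally spaced in Mel.
    hz_edges = mel_to_hz(np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_filters + 2))
    bank = np.zeros((n_filters, fft_freqs.size))
    for i in range(n_filters):
        lo, center, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def log_mel_spectrogram(signal, sample_rate, n_fft=2048, hop=512,
                        n_filters=64, fmin=10.0, fmax=2000.0):
    """Return a (n_frames, n_filters) log-Mel spectrogram in dB."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filter_bank(sample_rate, n_fft, n_filters, fmin, fmax).T
    return 10.0 * np.log10(mel + 1e-10)


# Illustrative usage: a 100 Hz tone sampled at 8 kHz (hypothetical values).
sr = 8000
t = np.arange(sr) / sr
spec = log_mel_spectrogram(np.sin(2 * np.pi * 100.0 * t), sr)
```

In this sketch, `n_filters`, `fmin`, and `fmax` control the number of bands and the frequency range, while `n_fft` sets the underlying frequency resolution. These are the kinds of knobs I would like to locate in the actual pipeline.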

Thank you!

Garywatersignal, May 26 '25 12:05