BirdNET-Analyzer
Where can I find and modify the parameters of the Mel filter bank in the frequency analysis module?
Hello, and thank you for your great work on this project. I have a question about the audio preprocessing pipeline.

According to the literature and documentation, the input audio is first converted into a logarithmic Mel spectrogram, which is standard and expected. I am trying to adapt the model for whale sound classification, so I would like to adjust the parameters of the Mel filter bank (e.g., number of filters, frequency range) to better suit the characteristics of whale vocalizations.

While reviewing the code, I could not find any explicit definition or implementation of a Mel filter bank in the Python files. The only related code I found generates a log-frequency spectrogram, which appears to take the place of a standard Mel spectrogram. Could you please clarify:
- Where in the codebase the Mel filter bank (or equivalent frequency scaling) is defined?
- Is the Mel scale used at all, or is it replaced by a different log-frequency scaling method?
- Which file or function should I look into if I want to modify the frequency resolution or scaling to better capture low-frequency whale calls?
For reference, I have been looking for this functionality in birdnet_analyzer/utils.py. Any guidance or pointers would be greatly appreciated.
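To make the question concrete, here is a minimal NumPy sketch of the kind of Mel filter bank I have in mind. It is not taken from BirdNET-Analyzer: the function names, the HTK-style Mel formula, and the parameter values (4 kHz sample rate, 10 to 2000 Hz band, 32 filters) are my own illustrative assumptions, and the actual pipeline may do something quite different.

```python
import numpy as np


def hz_to_mel(hz):
    """Convert Hz to Mel using the HTK-style formula (an assumption here)."""
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filter_bank(sample_rate, n_fft, n_mels, fmin, fmax):
    """Build a (n_mels, n_fft // 2 + 1) matrix of triangular Mel filters.

    Hypothetical helper for illustration only, not a BirdNET-Analyzer function.
    """
    # Centre frequency of every bin of a real-valued FFT.
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)

    # n_mels + 2 edge frequencies, evenly spaced on the Mel scale.
    edges_hz = mel_to_hz(
        np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_mels + 2)
    )

    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lower, centre, upper = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (fft_freqs - lower) / (centre - lower)
        falling = (upper - fft_freqs) / (upper - centre)
        # Triangle: the smaller of the two slopes, clipped at zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


# Illustrative low-frequency setup (assumed values, not BirdNET defaults).
whale_bank = mel_filter_bank(
    sample_rate=4000, n_fft=2048, n_mels=32, fmin=10.0, fmax=2000.0
)
```

Multiplying `whale_bank` by a power spectrogram of shape `(n_fft // 2 + 1, n_frames)` and taking the logarithm would give the log-Mel representation I would like to tune. What I am looking for is the place in the codebase where the equivalents of `n_mels`, `fmin`, and `fmax` are set.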
Thank you!