fairseq
The normalization settings of input audio
❓ Questions and Help
Before asking:
- search the issues.
- search the docs.
What is your question?
In wav2vec 2.0 and HuBERT, the config task.normalize is set to False (which means the input audio is not normalized), but in data2vec it is set to True, and the original paper also mentions this. Will it have a big effect on experiment results?
Code
What have you tried?
What's your environment?
- fairseq Version (e.g., 1.0 or main):
- PyTorch Version (e.g., 1.0)
- OS (e.g., Linux):
- How you installed fairseq (pip, source):
- Build command you used (if compiling from source):
- Python version:
- CUDA/cuDNN version:
- GPU models and configuration:
- Any other relevant information:
@alexeib
It won't have much of an effect, but you have to match the feature extractor to the normalization setting:
- normalize in dataloader -> layer norm in feature extractor
- no normalization in dataloader -> group norm in first block of feature extractor + feature_grad_mult = 0.1 (rescale feature extractor grads by 0.1)
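To make the pairing above concrete, here is a rough sketch in plain Python. The helper names are hypothetical and not part of fairseq. The `extractor_mode` values `"layer_norm"` and `"default"` are an assumption about how the wav2vec 2.0 model config names the layer-norm and group-norm-in-first-block variants, so check them against your config. The normalization function assumes that "normalize" means scaling each utterance to zero mean and unit variance.

```python
import math


def matching_extractor_settings(normalize: bool) -> dict:
    """Hypothetical helper: pick feature extractor settings that match
    the dataloader normalization setting, following the reply above."""
    if normalize:
        # Normalized input -> layer norm in the feature extractor.
        # ("layer_norm" as the config value is an assumption.)
        return {"extractor_mode": "layer_norm"}
    # Raw input -> group norm in the first block of the feature extractor,
    # with feature extractor gradients rescaled by 0.1.
    # ("default" as the config value is an assumption.)
    return {"extractor_mode": "default", "feature_grad_mult": 0.1}


def normalize_waveform(samples, eps=1e-5):
    """Scale one utterance to zero mean and (roughly) unit variance.
    Assumed meaning of normalize=True; not fairseq's own code."""
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    return [(x - mean) / math.sqrt(var + eps) for x in samples]


if __name__ == "__main__":
    print(matching_extractor_settings(True))
    print(matching_extractor_settings(False))
    print(normalize_waveform([0.0, 1.0, 2.0, 3.0]))
```

The reply does not give a feature_grad_mult value for the normalized case, so the sketch leaves it unset there.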