fairseq The normalization settings of input audio

Open Ther-nullptr opened this issue 1 year ago • 1 comments

❓ Questions and Help

What is your question?

In wav2vec 2.0 and HuBERT, the config option task.normalize is set to False (meaning the input audio is not normalized), but in data2vec it is set to True, and the original paper mentions this as well. Does this setting have a big effect on experimental results?

Sep 01 '22 13:09 Ther-nullptr

@alexeib

Sep 01 '22 13:09 Ther-nullptr

It won't have much of an effect, but you have to match the feature extractor to the normalization setting:

normalize in dataloader -> layer norm in feature extractor
no normalization in dataloader -> group norm in first block of feature extractor + feature_grad_mult = 0.1 (rescale feature extractor grads by 0.1)

Sep 18 '22 06:09 alexeib
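
The pairing in the reply above can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes the wav2vec 2.0 style config names in fairseq (extractor_mode, where "default" selects group norm in the first block and "layer_norm" selects layer norm, plus feature_grad_mult), and it assumes that normalizing in the dataloader means scaling each utterance to zero mean and unit variance. The helper names normalize_waveform and matching_overrides are hypothetical and not part of fairseq.

```python
import math


def normalize_waveform(samples, eps=1e-5):
    """Scale one utterance to zero mean and unit variance.

    Assumption: this is what task.normalize=True amounts to in the dataloader.
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    scale = math.sqrt(var + eps)
    return [(x - mean) / scale for x in samples]


def matching_overrides(normalize):
    """Hypothetical helper: config overrides that pair the feature
    extractor with the chosen task.normalize value."""
    if normalize:
        # Normalized input -> layer norm in the feature extractor.
        return {
            "task.normalize": True,
            "model.extractor_mode": "layer_norm",
        }
    # Raw input -> group norm in the first block, and feature
    # extractor gradients rescaled by 0.1.
    return {
        "task.normalize": False,
        "model.extractor_mode": "default",
        "model.feature_grad_mult": 0.1,
    }


if __name__ == "__main__":
    for key, value in matching_overrides(False).items():
        print(f"{key}={value}")
```

The printed key=value pairs are in the form used for command-line config overrides; check them against the config of the model you are actually training before relying on them.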
