Why is wav2vec 2.0 masking prob set to 0.65?

Open TParcollet opened this issue 2 years ago • 1 comments

❓ Questions and Help

Hi there!

A quick question about masking in wav2vec 2.0. The original paper states that each frame has a probability p = 0.065 of starting a mask span of length M = 10, so each frame ends up masked with probability 1 - (1 - 0.065)^10 ≈ 0.49. In the implementation of the masking function, however, the probability is set to 0.65 and is used to compute the number of masks from the sequence length. As I read it, this results in 65% of the sequence being masked, so each frame has a 0.65 probability of being masked instead of 0.49. Am I missing something?

Not sure, but maybe @alexeib can help here :-)

Many thanks!

TParcollet, Jun 26 '23 18:06

I think it comes out to 0.49 because the mask spans overlap. Check the masked_pct variable in the code and see what value it reports.
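The overlap argument can be checked numerically. The sketch below is not the fairseq masking function. It is a simplified stand-in that assumes the number of spans is computed as mask_prob * seq_len / mask_length and that span starts are sampled without replacement. The name masked_fraction and the sequence length are made up for illustration.

```python
import random


def masked_fraction(seq_len, mask_prob=0.65, mask_length=10, seed=0):
    """Fraction of frames covered by masks when spans may overlap.

    Simplified sketch (assumption): not the actual fairseq implementation.
    """
    rng = random.Random(seed)
    # mask_prob is divided by the span length, so roughly
    # mask_prob / mask_length of the frames become span starts.
    num_spans = int(mask_prob * seq_len / mask_length + rng.random())
    # Start positions are distinct, but the spans themselves can overlap.
    starts = rng.sample(range(seq_len - mask_length + 1), num_spans)
    masked = set()
    for start in starts:
        masked.update(range(start, start + mask_length))
    return len(masked) / seq_len


# Value from the paper's description: p = 0.065, M = 10.
analytic = 1 - (1 - 0.065) ** 10
simulated = masked_fraction(seq_len=100_000)

print(f"analytic:  {analytic:.3f}")
print(f"simulated: {simulated:.3f}")
```

Under these assumptions the simulated fraction should land close to the analytic value of about 0.49 rather than 0.65, because 0.65 / 10 = 0.065 of the frames start a span and overlapping spans cover fewer distinct frames than their combined length.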

orena1, Jul 02 '23 17:07