
Why wav2vec2-base-960h is trained without using attention mask?

[Open] YWMditto opened this issue 2 years ago · 0 comments

I have looked at the code of Wav2Vec2FeatureExtractor in transformers, and it says that the model wav2vec2-base-960h was trained without using an attention mask.

I wonder why and how the model can be trained without an attention mask to mask out the padded positions.

Doesn't this cause errors when the padded positions are included in the computation?

[Screenshot of the Wav2Vec2FeatureExtractor docstring stating that wav2vec2-base models were trained without attention_mask]
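For reference, here is a minimal sketch (not part of the original issue) of the behavior being asked about, assuming the Hugging Face transformers package and the public facebook/wav2vec2-base-960h checkpoint:

```python
from transformers import Wav2Vec2FeatureExtractor

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")

# For this checkpoint the extractor is configured with
# return_attention_mask=False, reflecting that the base model was
# trained on zero-padded batches without an attention mask.
print(extractor.return_attention_mask)  # False for wav2vec2-base-960h

# Padding a batch of raw waveforms of different lengths: the shorter
# clip is simply zero-padded and no attention_mask is returned.
batch = [[0.1] * 16000, [0.1] * 8000]  # two dummy clips (1 s and 0.5 s at 16 kHz)
inputs = extractor(batch, sampling_rate=16000, padding=True, return_tensors="np")
print(inputs.keys())  # only "input_values"; no "attention_mask"
```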

YWMditto, Oct 25 '22 05:10