Did XLM-R apply subword regularization?

Open mani-rai opened this issue 1 year ago • 0 comments

Looking at the "MultilingualMaskedLMTask" code, dictionaries seem to be required to set up this task. To build the dictionaries, we have to encode the text into sentence pieces upfront. Preprocessing the raw text upfront doesn't allow for regularization noise, since each sentence ends up with one fixed segmentation. So the code doesn't seem to apply subword regularization. Is it the case that XLM-R doesn't apply subword regularization?
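To make the distinction concrete, here is a toy sketch in plain Python. It is not fairseq or XLM-R code: `TOY_VOCAB`, its probabilities, and the helpers `encode_best` and `encode_sampled` are all made up for illustration. It only shows why a one-time preprocessing pass yields a single fixed segmentation, while subword regularization needs to re-sample the segmentation each time a sentence is read.

```python
import math
import random

# Hypothetical toy unigram vocabulary: piece -> log-probability.
# The values are invented for illustration only.
TOY_VOCAB = {
    "un": math.log(0.10),
    "related": math.log(0.05),
    "rel": math.log(0.04),
    "ated": math.log(0.04),
    "unrelated": math.log(0.001),
    "u": math.log(0.02),
    "n": math.log(0.02),
}


def segmentations(word, vocab):
    """Enumerate every way to split `word` into pieces found in `vocab`."""
    if not word:
        return [[]]
    results = []
    for end in range(1, len(word) + 1):
        piece = word[:end]
        if piece in vocab:
            for rest in segmentations(word[end:], vocab):
                results.append([piece] + rest)
    return results


def score(seg, vocab):
    """Log-probability of a segmentation under the unigram model."""
    return sum(vocab[p] for p in seg)


def encode_best(word, vocab):
    """Deterministic encoding: what a one-time preprocessing pass would store."""
    return max(segmentations(word, vocab), key=lambda s: score(s, vocab))


def encode_sampled(word, vocab, rng, alpha=0.5):
    """Sample a segmentation with probability proportional to P(seg) ** alpha."""
    candidates = segmentations(word, vocab)
    weights = [math.exp(alpha * score(s, vocab)) for s in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Upfront preprocessing: the same pieces are seen in every epoch.
    print("fixed:  ", encode_best("unrelated", TOY_VOCAB))
    # On-the-fly sampling: the pieces can differ from epoch to epoch.
    for epoch in range(3):
        print("sampled:", encode_sampled("unrelated", TOY_VOCAB, rng))
```

With a real SentencePiece model, the sampled variant would correspond to calling `encode` with `enable_sampling=True` at data-loading time rather than during preprocessing. Whether XLM-R training did this is exactly what I am asking.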

mani-rai, Jul 18 '22 09:07