DeBERTa
DeBERTaV3: is EMD used in the generator and the discriminator?
I'm impressed by the performance of DeBERTaV3 and am trying to reproduce it.
I have a question about the DeBERTaV3 model architecture. Is EMD (Enhanced Mask Decoder) used in the generator for the MLM task? Looking at the released code for ReplacedTokenDetectionModel, it seems that the discriminator does not use EMD. Can you confirm that?
Thanks!