Sylvain Gugger
Thanks for explaining, @nbroad1881. I now understand the problem a little better. I don't think we can avoid having two classes for masked LM (for instance `OldDebertaForMaskedLM` and `NewDebertaForMaskedLM`)...
It needs to be the weights/bias that have the vocab_size dim.
Leave those two as None for now, then. I'll add that in the follow-up PR.
The classes `OldDebertaForMaskedLM` and `NewDebertaForMaskedLM` are not meant to be public. They are an internal artifact to maintain backward compatibility; the user will only use the `DebertaForMaskedLM` class and a...
@nbroad1881 Do you want me to fully take over on this?
Ok, will have a look early next week!
Yes, that would be great! And if you could fix the merge conflicts/rebase on main that would also be awesome!
If you have merged your branch with upstream then just a push should be enough to resolve the conflicts. To rebase just do this on your branch: ``` git rebase...
Thanks again and sorry for the long wait on this PR!
For the tests, you will need to rebase on main so we can have them run (I fixed the command launching them yesterday). You should also create decorators `require_sudashi` and...
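A decorator of the kind mentioned above usually skips a test when an optional dependency is not installed. The sketch below shows that general pattern using only the standard library; `is_package_available` and `require_package` are hypothetical helper names, not the decorators requested in the comment, and the module name to check for a given tokenizer backend is left as an assumption for the caller to supply.

```python
import importlib.util
import unittest


def is_package_available(name):
    """Return True if a top-level module called `name` can be imported."""
    return importlib.util.find_spec(name) is not None


def require_package(name):
    """Build a decorator that skips the test when `name` is not installed."""

    def decorator(test_case):
        # skipUnless returns the test unchanged when the condition holds,
        # and a wrapper that raises unittest.SkipTest otherwise.
        return unittest.skipUnless(
            is_package_available(name), f"test requires {name}"
        )(test_case)

    return decorator


class ExampleTest(unittest.TestCase):
    @require_package("json")  # standard library module, always present
    def test_runs(self):
        self.assertTrue(True)

    @require_package("surely_not_installed_pkg_xyz")  # placeholder name
    def test_skipped(self):
        self.fail("should have been skipped")
```

A dedicated decorator per backend would then be a one-line wrapper around `require_package` with the relevant module name.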