transformers
Add DeBERTa-base model for usage in EncoderDecoderModel.
🚀 Feature request
Add DeBERTa-base model as an option for creating an EncoderDecoderModel.
Motivation
Currently only BERT and RoBERTa models can be turned into a Seq2Seq model via the EncoderDecoder class, and for those of us developing DeBERTa models from scratch it would be wonderful to be able to build a Seq2Seq model from them. Also, the DeBERTa-base model performs considerably better than BERT and RoBERTa.
Your contribution
Great idea! Do you want to take a stab at it?
Hi @alexvaca0, this is an interesting feature! But I was curious: DeBERTa, BERT, and RoBERTa are encoder-only models, so there is no decoder part, right? I checked their model classes and could not find a Decoder / EncoderDecoder class. Can you please give more insight into this?
That's right, there's no decoder in those models, but there is a class in Transformers, EncoderDecoderModel, that makes it possible to create encoder-decoder architectures from encoder-only models :)
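To make that concrete, here is a minimal sketch of how EncoderDecoderModel ties two encoder-style configs together. It uses BERT since DeBERTa isn't supported yet, and tiny randomly initialized configs (the sizes are arbitrary, just to keep it light):

```python
import torch
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

# Tiny configs purely for illustration (sizes are arbitrary).
enc_cfg = BertConfig(vocab_size=100, hidden_size=64, num_hidden_layers=2,
                     num_attention_heads=2, intermediate_size=128)
dec_cfg = BertConfig(vocab_size=100, hidden_size=64, num_hidden_layers=2,
                     num_attention_heads=2, intermediate_size=128)

# from_encoder_decoder_configs marks the decoder config with is_decoder=True
# and add_cross_attention=True, so the decoder gets the cross-attention
# layers that attend over the encoder's output.
cfg = EncoderDecoderConfig.from_encoder_decoder_configs(enc_cfg, dec_cfg)
model = EncoderDecoderModel(config=cfg)

src = torch.randint(0, 100, (1, 8))  # (batch, source_len)
tgt = torch.randint(0, 100, (1, 6))  # (batch, target_len)
out = model(input_ids=src, decoder_input_ids=tgt)
print(out.logits.shape)  # one distribution over the vocab per target token
```

In practice you would more likely start from pretrained checkpoints, e.g. `EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "bert-base-uncased")`; the goal of this issue is to make the same call work with a DeBERTa checkpoint.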
Perfect, let me have a look at it and see if I can code that adaptation @LysandreJik
Great! If you run into any blockers, feel free to ping us. If you want to add the possibility for DeBERTa to be a decoder, you'll probably need to add the cross attention layers.
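For intuition, a rough sketch of the kind of cross-attention block a decoder layer adds, in plain PyTorch. This is not the actual DeBERTa implementation (which uses disentangled attention); the class and argument names here are made up for illustration:

```python
import torch
import torch.nn as nn

class CrossAttentionBlock(nn.Module):
    """Illustrative only: queries come from the decoder states,
    keys and values come from the encoder states."""
    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden_size)

    def forward(self, decoder_states, encoder_states):
        # Each decoder position attends over all encoder positions,
        # followed by a residual connection and LayerNorm (post-LN, as in BERT).
        attn_out, _ = self.attn(decoder_states, encoder_states, encoder_states)
        return self.norm(decoder_states + attn_out)

dec = torch.randn(1, 6, 64)  # (batch, target_len, hidden)
enc = torch.randn(1, 8, 64)  # (batch, source_len, hidden)
block = CrossAttentionBlock(hidden_size=64, num_heads=2)
print(block(dec, enc).shape)  # same shape as the decoder states
```

In the actual port, these layers would be inserted into each DeBERTa layer when `add_cross_attention=True`, mirroring how BERT's modeling code does it.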
cc @patrickvonplaten and @patil-suraj, who have extensive experience with enc-dec models.
Hey, is this feature being worked on by someone? If not then I can pick it up! @LysandreJik
Would be great if you could pick it up @manish-p-gupta :-)
Great! Any specific things I should go through before taking it up? I'm familiar with the code of conduct and contributing guidelines. I'll also open a draft PR to continue the discussion there. Let me know if you think I need to look at anything else. @patrickvonplaten
@ArthurZucker has been working with DeBERTa models recently and can likely help and give advice!
Yes! Feel free to ping me for an early review if you have any doubts