transformers

Add DEBERTA-base model for usage in EncoderDecoderModel.

Open avacaondata opened this issue 3 years ago • 9 comments

🚀 Feature request

Add DEBERTA-base model as an option for creating an EncoderDecoderModel.

Motivation

Currently only BERT and RoBERTa models can be turned into a Seq2Seq model via the EncoderDecoder class, and for those of us developing DeBERTa models from scratch it would be wonderful to be able to build a Seq2Seq model from them. Also, the DeBERTa-base model works much better than BERT and RoBERTa.

Your contribution

avacaondata avatar Jun 30 '21 11:06 avacaondata

Great idea! Do you want to take a stab at it?

LysandreJik avatar Jun 30 '21 11:06 LysandreJik

Hi @alexvaca0, this is an interesting feature, but I'm curious: DeBERTa, BERT, and RoBERTa are encoder-based models, so there is no decoder part, right? I checked their model classes and could not find a Decoder / EncoderDecoder class! Can you please give more insight into it?

bhadreshpsavani avatar Jul 01 '21 06:07 bhadreshpsavani

That's right, there's no decoder in those models, but there is a class in Transformers, EncoderDecoderModel, that lets you create encoder-decoder architectures from encoder-only architectures :)
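For readers following along, here is a minimal sketch of how EncoderDecoderModel wraps two encoder-only configurations. It uses BERT because that is already supported. The tiny sizes in `tiny` are arbitrary illustration values chosen so the model is randomly initialised and no weights are downloaded; they are assumptions, not recommended settings.

```python
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

# Arbitrary small sizes (illustration only) so this runs without downloading weights.
tiny = dict(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
encoder_config = BertConfig(**tiny)
decoder_config = BertConfig(**tiny)

# This helper flags the second config as a decoder and enables cross-attention,
# which is what allows the decoder to attend to the encoder's hidden states.
config = EncoderDecoderConfig.from_encoder_decoder_configs(encoder_config, decoder_config)
model = EncoderDecoderModel(config=config)

print(type(model.encoder).__name__, type(model.decoder).__name__)
print(model.decoder.config.is_decoder, model.decoder.config.add_cross_attention)
```

With pretrained checkpoints the usual entry point is `EncoderDecoderModel.from_encoder_decoder_pretrained(...)`. The request in this issue is for DeBERTa to work in the decoder slot as well.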

Perfect, let me have a look at it and see if I can code that adaptation @LysandreJik

avacaondata avatar Jul 01 '21 11:07 avacaondata

Great! If you run into any blockers, feel free to ping us. If you want to add the possibility for DeBERTa to be a decoder, you'll probably need to add the cross attention layers.
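To illustrate what "adding cross-attention layers" means, below is a deliberately simplified sketch in plain Python: single head, no learned projections, no masking. The function name `cross_attention` and the toy vectors are hypothetical and exist only for this example. A real layer would add query/key/value projection matrices and multiple heads, and DeBERTa's disentangled attention also uses relative-position terms that are not modelled here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(decoder_states, encoder_states):
    """Queries come from the decoder; keys and values come from the encoder."""
    dim = len(encoder_states[0])
    contexts, all_weights = [], []
    for query in decoder_states:
        # Scaled dot-product score of this decoder position against every encoder position.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in encoder_states
        ]
        weights = softmax(scores)
        # Weighted sum of encoder vectors: the context this decoder position receives.
        context = [
            sum(w * value[j] for w, value in zip(weights, encoder_states))
            for j in range(dim)
        ]
        contexts.append(context)
        all_weights.append(weights)
    return contexts, all_weights


# Toy inputs: 3 encoder positions and 2 decoder positions, hidden size 2.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_states = [[1.0, 0.0], [0.0, 2.0]]
contexts, weights = cross_attention(decoder_states, encoder_states)
print(len(contexts), len(weights[0]))
```

The key point is that the output has one vector per decoder position, while the attention weights range over encoder positions. An encoder-only model has no such layer, which is why it has to be added before the model can act as a decoder.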

cc @patrickvonplaten and @patil-suraj, who have extensive experience with enc-dec models.

LysandreJik avatar Jul 01 '21 13:07 LysandreJik

Hey, is this feature being worked on by someone? If not then I can pick it up! @LysandreJik

manish-p-gupta avatar Jan 22 '23 12:01 manish-p-gupta

Would be great if you could pick it up @manish-p-gupta :-)

patrickvonplaten avatar Jan 22 '23 20:01 patrickvonplaten

Great! Are there any specific things I should go through before taking it up? I'm familiar with the Code of Conduct and the contributing guidelines. I'll also open a draft PR so we can carry on the discussion there. Let me know if you think I need to look at anything else. @patrickvonplaten

manish-p-gupta avatar Jan 23 '23 02:01 manish-p-gupta

@ArthurZucker has been working with DeBERTa models recently and can likely help and give advice!

LysandreJik avatar Jan 25 '23 19:01 LysandreJik

Yes! Feel free to ping me for an early review if you have any doubts

ArthurZucker avatar Jan 26 '23 13:01 ArthurZucker