DeBERTa
The implementation of DeBERTa (Decoding-enhanced BERT with Disentangled Attention)
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 550. GiB for an array with shape (28235788,) and data type |S20921. How did you solve this error? The pre-training corpus is being read incorrectly. My...
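As a rough illustration (not from the original thread), this kind of allocation failure can arise when every corpus line is loaded into a NumPy array with a fixed-width byte-string dtype: the array reserves the longest line's width (here 20,921 bytes) for all ~28 million rows, which comes out to roughly 550 GiB. The sketch below uses a hypothetical two-line stand-in for the corpus; reading lines as plain Python strings or an object-dtype array avoids the blow-up.

```
import numpy as np

# Hypothetical stand-in for the corpus lines; the second line is very long.
lines = ["short line", "a" * 20921]

# Fixed-width dtype: every element is padded to the longest string, so memory
# is roughly n_lines * max_len bytes (28,235,788 * 20,921 B ~ 550 GiB for the real corpus).
fixed = np.array(lines, dtype="S20921")
print(fixed.nbytes)  # 2 * 20921 bytes here; enormous at corpus scale

# Object dtype (or a plain Python list) stores references instead,
# so each line only costs its own length.
ragged = np.array(lines, dtype=object)
print(sum(len(s) for s in ragged))
```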
Really like your v3 models a lot; they deliver great performance at a reasonable size. Are the discriminator and generator versions of the v3 models available somewhere? The v3 models on...
I do not get why you would make a model even worse than a unigram model. And I read it is one of the best on the GLUE tasks, but I...
Hi, I want to report an issue that I found while running mlm.sh for deberta-base.

## Description

- Using the mlm.sh script for distributed training with more than one node causes...
I am planning to pretrain DeBERTa v3 with RTD and gradient-disentangled embedding sharing, but I don't have any proper references or resources on how to start pretraining it.
Would it be possible to share the configurations used for training the `small` and `xsmall` versions of DeBERTaV3, similar to the files provided in the `experiments/language_model/` directory (e.g. `deberta_base.json`)? Thank you...
How to pretrain mDeBERTa base and small on a custom dataset? How should the multilingual dataset be structured? I am planning to pretrain mDeBERTa specifically on multiple Indian languages....
When I run this code, it shows that the length of the tokenizer equals 128001.

```
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('microsoft/deberta-v3-large')
len(tokenizer)
```

But when I load...
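Not part of the original question, but one quick way to see the kind of mismatch being described is to compare the tokenizer length with the vocabulary size recorded in the model config; whether and how the two numbers differ for this checkpoint is the reporter's observation, not something asserted here.

```
from transformers import AutoConfig, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('microsoft/deberta-v3-large')
config = AutoConfig.from_pretrained('microsoft/deberta-v3-large')

# The tokenizer reports the number of pieces it actually knows about,
# while the config stores the embedding-matrix size the checkpoint was built with.
print(len(tokenizer))     # e.g. 128001, as reported above
print(config.vocab_size)  # may differ if the embedding matrix is padded
```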
I have been trying to use the pretrained `DebertaV2ForMaskedLM` based on the [example code](https://huggingface.co/transformers/model_doc/deberta_v2.html), but it is not working. The following BERT code (for which the example code looks basically...
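For context, a minimal masked-LM sketch along the lines of the linked example, written with the generic Auto classes; the checkpoint name is illustrative, and the reporter's point is that this pattern does not behave as expected for the DeBERTaV2 models.

```
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Illustrative checkpoint; the report concerns the DebertaV2 variant of this pattern.
name = "microsoft/deberta-v2-xlarge"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

text = f"The capital of France is {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Take the highest-scoring token at the masked position.
mask_index = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```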