DeBERTa
The implementation of DeBERTa (Decoding-enhanced BERT with Disentangled Attention)
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 550. GiB for an array with shape (28235788,) and data type |S20921. How did you solve this error? The pre-training corpus is being read incorrectly. My...
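As a rough illustration (not from the original thread), this kind of allocation failure can arise when every corpus line is loaded into a NumPy array with a fixed-width byte-string dtype: the array reserves the longest line's width (here 20,921 bytes) for all ~28 million rows, which comes out to roughly 550 GiB. The sketch below uses a hypothetical two-line stand-in for the corpus; reading lines as plain Python strings or an object-dtype array avoids the blow-up.

```
import numpy as np

# Hypothetical stand-in for the corpus lines; the second line is very long.
lines = ["short line", "a" * 20921]

# Fixed-width dtype: every element is padded to the longest string, so memory
# is roughly n_lines * max_len bytes (28,235,788 * 20,921 B ~ 550 GiB for the real corpus).
fixed = np.array(lines, dtype="S20921")
print(fixed.nbytes)  # 2 * 20921 bytes here; enormous at corpus scale

# Object dtype (or a plain Python list) stores references instead,
# so each line only costs its own length.
ragged = np.array(lines, dtype=object)
print(sum(len(s) for s in ragged))
```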
Really like your v3 models a lot; they deliver great performance at a reasonable size. Are the discriminator and generator versions of the v3 models available somewhere? The v3 models on...
I do not get why you would make a model even worse than a unigram model. And I read it is one of the best on the GLUE tasks, but I...
Hi, I want to report an issue that I found while running mlm.sh for deberta-base.

## Description

- Using the mlm.sh script for distributed training with more than one node causes...
I am planning to pretrain DeBERTa v3 with RTD and gradient-disentangled embedding sharing, but I don't have any proper references or resources on how to start pretraining it.
Would it be possible to share the configurations used for training the `small` and `xsmall` versions of DeBERTaV3, similar to the files provided in the `experiments/language_model/` directory (e.g. `deberta_base.json`)? Thank you...
How to pretrain mDeBERTa base and small on a custom dataset? How should the multilingual dataset be structured? I am planning to pretrain mDeBERTa specifically on multiple Indian languages....
When I run this code, it shows that the length of the tokenizer equals 128001.

```
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('microsoft/deberta-v3-large')
len(tokenizer)
```

But when I load...
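Not part of the original question, but one quick way to see the kind of mismatch being described is to compare the tokenizer length with the vocabulary size recorded in the model config; whether and how the two numbers differ for this checkpoint is the reporter's observation, not something asserted here.

```
from transformers import AutoConfig, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('microsoft/deberta-v3-large')
config = AutoConfig.from_pretrained('microsoft/deberta-v3-large')

# The tokenizer reports the number of pieces it actually knows about,
# while the config stores the embedding-matrix size the checkpoint was built with.
print(len(tokenizer))     # e.g. 128001, as reported above
print(config.vocab_size)  # may differ if the embedding matrix is padded
```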
I have been trying to use the pretrained `DebertaV2ForMaskedLM` based on the [example code](https://huggingface.co/transformers/model_doc/deberta_v2.html), but it is not working. The following BERT code (for which the example code looks basically...
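For context, a minimal masked-LM sketch along the lines of the linked example, written with the generic Auto classes; the checkpoint name is illustrative, and the reporter's point is that this pattern does not behave as expected for the DeBERTaV2 models.

```
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Illustrative checkpoint; the report concerns the DebertaV2 variant of this pattern.
name = "microsoft/deberta-v2-xlarge"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

text = f"The capital of France is {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Take the highest-scoring token at the masked position.
mask_index = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```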