DeBERTa

The implementation of DeBERTa

77 DeBERTa issues

Hi, what are the parameters for fine-tuning DeBERTa on SuperGLUE for each task, such as batch size, number of GPU cards, learning rate, etc.? I couldn't find the detailed parameters for each task in...

The paper says you add the absolute position embeddings after all Transformer layers, just before the softmax layer for MLM; however, I could not find these parameters. Looking forward to your response....
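For readers unfamiliar with the setup the issue describes: a minimal NumPy sketch of the idea, with illustrative shapes and names (this is not the actual DeBERTa code; `abs_pos_emb` and `mlm_proj` are hypothetical placeholders).

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, hidden, vocab = 8, 16, 100

hidden_states = rng.standard_normal((seq_len, hidden))  # output of the last Transformer layer
abs_pos_emb = rng.standard_normal((seq_len, hidden))    # absolute position embedding table (illustrative)
mlm_proj = rng.standard_normal((hidden, vocab))         # MLM decoder weight (illustrative)

# Absolute positions are injected only here, after all Transformer layers
# and just before the MLM softmax, as the issue paraphrases from the paper.
logits = (hidden_states + abs_pos_emb) @ mlm_proj

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

probs = softmax(logits)
print(probs.shape)  # (8, 100)
```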

In `disentangled_attention.py`, `pos_query_layer` has 3 dimensions, but when p2p attention is selected this line raises an `IndexError`:

```python
pos_query = pos_query_layer[:, :, att_span:, :]
```

Test code:

```python
import os
os.chdir('F:\\WorkSpace\\DeBERTa-master')
import numpy as...
```
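The failure mode the issue reports can be reproduced in isolation: applying four index expressions to a 3-D array raises `IndexError`, while the same slice on a 4-D array succeeds. A minimal sketch with illustrative shapes (not the repository's actual tensors):

```python
import numpy as np

# 3-D array standing in for pos_query_layer as reported in the issue.
pos_query_layer = np.zeros((2, 8, 64))
att_span = 4

try:
    # Four index expressions on a 3-D array: this is the reported crash.
    pos_query = pos_query_layer[:, :, att_span:, :]
except IndexError as e:
    print("IndexError:", e)

# The same slice on a 4-D array works as intended.
pos_query_4d = np.zeros((1, 2, 8, 64))
ok = pos_query_4d[:, :, att_span:, :]
print(ok.shape)  # (1, 2, 4, 64)
```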

An error occurs in `DisentangledSelfAttention.forward()` when `query_states.size(1) > hidden_states.size(1)`: https://github.com/microsoft/DeBERTa/blob/master/DeBERTa/deberta/disentangled_attention.py line 165:

```python
p2c_att = torch.gather(p2c_att, dim=-2, index=pos_index.expand(p2c_att.size()[:2] + (pos_index.size(-2), key_layer.size(-2))))
```

`self.deberta = deberta.DeBERTa(pre_trained='base')` — when `pre_trained` is `'base'`, `'large'`, or `'xlarge'`, this throws:

```
Traceback (most recent call last):
  File "/home/v-weishengli/Downloads/pycharm-community-2020.2.2/plugins/python-ce/helpers/pydev/pydevd.py", line 1448, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/v-weishengli/Downloads/pycharm-community-2020.2.2/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile...
```

Code:

```python
self.deberta = deberta.DeBERTa(pre_trained="/path/to/pretrained_dir/pytorch_model.bin")
self.deberta.apply_state()
```

Message:

```
File "/home/user/DeBERTa/DeBERTa/deberta/deberta.py", line 143, in key_match
    assert len(c)==1, (c, s, key)
AssertionError: ([], dict_keys(['deberta.embeddings.word_embeddings.weight', 'deberta.embeddings.LayerNorm.weight', 'deberta.embeddings.LayerNorm.bias', 'deberta.encoder.layer.0.attention.self.q_bias', 'deberta.encoder.layer.0.attention.self.v_bias', 'deberta.encoder.layer.0.attention.self.in_proj.weight', 'deberta.encoder.layer.0.attention.self.pos_proj.weight', 'deberta.encoder.layer.0.attention.self.pos_q_proj.weight',...
```
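The `AssertionError` shows `key_match` finding no candidate (`[]`) while every checkpoint key carries a `deberta.` prefix. A plausible workaround, offered only as an assumption and not as an official fix, is to strip that prefix from the state dict before loading. A minimal sketch with two illustrative keys:

```python
# Hedged sketch: remap checkpoint keys by dropping the "deberta." prefix
# so they can line up with the model's own parameter names. The state dict
# here is a tiny stand-in, not a real checkpoint.
state = {
    'deberta.embeddings.word_embeddings.weight': 'w0',
    'deberta.encoder.layer.0.attention.self.q_bias': 'w1',
}

prefix = 'deberta.'
remapped = {(k[len(prefix):] if k.startswith(prefix) else k): v
            for k, v in state.items()}
print(sorted(remapped))
```

Whether this resolves the assertion depends on how `apply_state` matches keys, which the truncated report does not show.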

Hello there, are there any instructions on how to pretrain DeBERTa from scratch? Thanks

Just curious about the tokenizer switch in V2: can you share why you switched? And what SentencePiece training settings did you use to train the v2 spm tokenizer? Was...

In `deberta.mlm`, `MaskedLayerNorm` is not imported from `deberta.ops`, and `PreLayerNorm` is undefined. Also, I'm not sure whether `deberta.mlm` contains the code for pretraining?