DeBERTa
The implementation of DeBERTa
Hi, what are the hyperparameters used to fine-tune DeBERTa on SuperGLUE for each task, such as batch size, number of GPU cards, learning rate, etc.? I couldn't find the detailed parameters of each task in...
The paper says you add the absolute position embeddings after all Transformer layers, before the softmax layer for MLM; however, I could not find these parameters. Looking forward to your response...
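For readers trying to locate this in the code, here is a minimal sketch of what the paper describes: absolute position embeddings injected only after the last Transformer layer, just before the MLM softmax. Every class and parameter name below is hypothetical, not the repo's actual API.

```python
import torch
import torch.nn as nn

class EnhancedMaskDecoderSketch(nn.Module):
    """Illustrative sketch only: absolute position embeddings are added to the
    final hidden states, after all Transformer layers and right before the MLM
    softmax. Names and shapes are assumptions, not DeBERTa's real code."""

    def __init__(self, hidden_size=768, vocab_size=30522, max_position=512):
        super().__init__()
        self.abs_pos_embeddings = nn.Embedding(max_position, hidden_size)  # hypothetical
        self.lm_head = nn.Linear(hidden_size, vocab_size)

    def forward(self, hidden_states):
        # hidden_states: [batch, seq_len, hidden], output of the last Transformer layer
        seq_len = hidden_states.size(1)
        positions = torch.arange(seq_len, device=hidden_states.device)
        # Absolute positions enter only here, before the MLM softmax.
        hidden_states = hidden_states + self.abs_pos_embeddings(positions)
        return self.lm_head(hidden_states)  # logits fed to the MLM softmax
```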
In disentangled_attention.py, pos_query_layer has 3 dimensions, but when p2p attention is selected, this line raises an IndexError:

```python
pos_query = pos_query_layer[:, :, att_span:, :]
```

Test code:

```python
import os
os.chdir('F:\\WorkSpace\\DeBERTa-master')
import numpy as...
```
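For context, the failure is simply four index expressions applied to a 3-D tensor. A minimal standalone reproduction (the shapes below are illustrative assumptions, not the repo's actual sizes):

```python
import torch

# Hypothetical shape: the report says pos_query_layer only has 3 dimensions.
pos_query_layer = torch.randn(12, 512, 64)
att_span = 256

try:
    # Four index expressions require a 4-D tensor, hence the IndexError.
    pos_query = pos_query_layer[:, :, att_span:, :]
except IndexError as e:
    print(e)  # "too many indices for tensor of dimension 3"
```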
An error occurs in DisentangledSelfAttention.forward() when query_states.size(1) > hidden_states.size(1): https://github.com/microsoft/DeBERTa/blob/master/DeBERTa/deberta/disentangled_attention.py

Line 165:

```python
p2c_att = torch.gather(p2c_att, dim=-2, index=pos_index.expand(p2c_att.size()[:2] + (pos_index.size(-2), key_layer.size(-2))))
```
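For reference, a minimal sketch of the gather mechanics on that line with toy shapes (all sizes below are illustrative assumptions, not the actual tensors): the index is expanded so its last dimension matches the key length, then rows are gathered along dim=-2, producing one row per query position.

```python
import torch

# Illustrative shapes only, not the repo's actual tensors.
p2c_att = torch.randn(2, 4, 6, 6)                       # [batch, heads, key_len, key_len]
pos_index = torch.zeros(2, 4, 3, 1, dtype=torch.long)   # [batch, heads, query_len, 1]

# Mirror of line 165: expand the index so its last dim matches key_len,
# then gather rows along dim=-2. The output has query_len rows.
index = pos_index.expand(p2c_att.size()[:2] + (pos_index.size(-2), p2c_att.size(-2)))
out = torch.gather(p2c_att, dim=-2, index=index)
print(out.shape)  # torch.Size([2, 4, 3, 6])
```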
`self.deberta = deberta.DeBERTa(pre_trained='base')` throws the following when pre_trained is 'base', 'large', or 'xlarge':

```
Traceback (most recent call last):
  File "/home/v-weishengli/Downloads/pycharm-community-2020.2.2/plugins/python-ce/helpers/pydev/pydevd.py", line 1448, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/v-weishengli/Downloads/pycharm-community-2020.2.2/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile...
```
Does anyone know where to get them? Thank you.
Code:

```python
self.deberta = deberta.DeBERTa(pre_trained="/path/to/pretrained_dir/pytorch_model.bin")
self.deberta.apply_state()
```

Message:

```
File "/home/user/DeBERTa/DeBERTa/deberta/deberta.py", line 143, in key_match
    assert len(c)==1, (c, s, key)
AssertionError: ([], dict_keys(['deberta.embeddings.word_embeddings.weight', 'deberta.embeddings.LayerNorm.weight', 'deberta.embeddings.LayerNorm.bias', 'deberta.encoder.layer.0.attention.self.q_bias', 'deberta.encoder.layer.0.attention.self.v_bias', 'deberta.encoder.layer.0.attention.self.in_proj.weight', 'deberta.encoder.layer.0.attention.self.pos_proj.weight', 'deberta.encoder.layer.0.attention.self.pos_q_proj.weight',...
```
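The assertion shows every checkpoint key carrying a `deberta.` prefix that `key_match` fails to pair with a model parameter. A hedged workaround sketch, not an official fix: normalize the state dict by stripping that prefix before loading (the paths are placeholders).

```python
import torch

# Assumption: the mismatch is only the 'deberta.' prefix on every checkpoint
# key, as the AssertionError suggests. Strip it and re-save the state dict.
state = torch.load("/path/to/pretrained_dir/pytorch_model.bin", map_location="cpu")
state = {k[len("deberta."):] if k.startswith("deberta.") else k: v
         for k, v in state.items()}
torch.save(state, "/path/to/pretrained_dir/pytorch_model.stripped.bin")
```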
Hello there, are there any instructions on how to pretrain DeBERTa from scratch? Thanks
Just curious about the switch of tokenizer in V2: can you share why you switched? And what training settings for SentencePiece did you use to train the v2 SPM tokenizer? Was...
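To frame the question, this is what training an SPM tokenizer looks like with the SentencePiece API. The actual corpus, vocabulary size, and flags used for v2 are exactly what is being asked here, so every value below is an assumption for illustration only.

```python
import sentencepiece as spm

# Hedged sketch: none of these settings are confirmed as DeBERTa v2's.
spm.SentencePieceTrainer.train(
    input="corpus.txt",            # hypothetical training corpus
    model_prefix="deberta_v2_spm",
    vocab_size=128000,             # v2 models ship a ~128k vocab; exact value unknown
    model_type="unigram",          # assumption; SentencePiece's default
    character_coverage=0.9995,
)
```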
In `deberta.mlm`, `MaskedLayerNorm` is not imported from `deberta.ops`, and `PreLayerNorm` is undefined. Also, I'm not sure whether `deberta.mlm` contains the code for pretraining?
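A minimal sketch of the missing import, assuming `MaskedLayerNorm` is in fact defined in `deberta/ops.py` as the module path suggests (`PreLayerNorm` would still need its own definition):

```python
# Top of DeBERTa/deberta/mlm.py -- assumption: MaskedLayerNorm lives in ops.py
from .ops import MaskedLayerNorm
```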