
About the reimplementation

zxk19981227 opened this issue 2 years ago · 0 comments

Describe the bug

Model I am using (UniLM, MiniLM, LayoutLM ...): UniLM v1 (unilm1-base-cased)

The problem arises when using:

  • [ ] the official example scripts: (give details below)
  • [x] my own modified scripts: (give details below)

When I try to use the UniLM v1 based model to generate text as described in the appendix of the paper, it predicts special tokens such as "[PAD]" and "A". The code is below; neither the official paper nor the GitHub repository explains how this generation step should be implemented.

To Reproduce
Steps to reproduce the behavior:

  1. Download the unilm1-base-cased checkpoint.
  2. Run the code below.

Expected behavior
The model should continue the prompt with a plausible completion rather than special tokens such as "[PAD]".

```python
import torch
import sys
from unilm.src.pytorch_pretrained_bert.modeling import BertForMaskedLM
from transformers import BertTokenizer, BertConfig
from torch.nn import Module
from utils import create_attention_mask_for_lm


class GenerateModel(Module):
    def __init__(self, model_name):
        super(GenerateModel, self).__init__()
        self.tokenizer = BertTokenizer.from_pretrained('bert-base-cased')
        config = BertConfig.from_pretrained('bert-base-cased')
        # load the UniLM v1 checkpoint into BertForMaskedLM from pytorch_pretrained_bert
        state_dict = torch.load('/data1/zhouxukun/dynamic_backdoor_attack/pretrained_model/unilm1-base-cased.bin')
        self.model = BertForMaskedLM.from_pretrained(pretrained_model_name='bert-base-uncased', state_dict=state_dict)

    def forward(self, sentence):
        token_ids = self.tokenizer(sentence).input_ids
        eos_location = len(token_ids) - 1
        # reserve 10 extra positions for the tokens to be generated
        for i in range(10):
            token_ids.append(0)
        tensor = torch.tensor(token_ids)
        input_sentence = tensor.unsqueeze(0).cpu()
        attention_mask = create_attention_mask_for_lm(input_sentence.shape[-1]).cpu()
        print(attention_mask.shape)
        # left-to-right generation: place [MASK] at the next position, predict it, write it back
        for i in range(10):
            input_sentence[0][eos_location] = self.tokenizer.mask_token_id
            predictions = self.model(input_ids=input_sentence, attention_mask=attention_mask)[0]
            predictions_words = torch.argmax(predictions[0][eos_location], dim=-1).item()
            input_sentence[0][eos_location] = predictions_words
            eos_location += 1
        print(input_sentence[0])
        return self.tokenizer.convert_ids_to_tokens(input_sentence[0])


if __name__ == "__main__":
    model = GenerateModel("microsoft/unilm-base-cased").cpu()
    tokens = model('Where is the home for panda? The home for panda is')
    print(model.tokenizer.convert_tokens_to_string(tokens))
```
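The `create_attention_mask_for_lm` helper is imported from a local `utils` module that is not included in the issue. For completeness, a minimal sketch of what such a helper could look like is given below, assuming it builds a lower-triangular (left-to-right) attention mask of shape [1, seq_len, seq_len]; the function name and mask shape are assumptions, not part of the original report.

```python
import torch

def create_attention_mask_for_lm(seq_len: int) -> torch.Tensor:
    # Hypothetical implementation: position i may only attend to positions <= i,
    # i.e. a causal / left-to-right mask, as in UniLM's unidirectional LM objective.
    mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.long))
    # Add a batch dimension so the mask broadcasts over the batch: [1, seq_len, seq_len].
    return mask.unsqueeze(0)
```

Note that for a question-answer style prompt like the one above, the UniLM paper's sequence-to-sequence setting uses a mask that is bidirectional within the source segment and left-to-right only over the target segment, so a purely lower-triangular mask corresponds to the unidirectional LM setting.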

  • Platform:
  • Python version:
  • PyTorch version (GPU?):

zxk19981227 · Mar 19 '22 08:03