electra_pytorch

Problem with temperature_sampling in modeling_utils.py

Open MrSworder opened this issue 2 years ago • 0 comments

This is the error I get when running run_pretraining.py. It points to the line pred_ids = probs.cpu().multinomial(probs.size()[1], replacement=False) in the temperature_sampling function: multinomial only accepts 1-D or 2-D weights, but here probs has shape [32, 128, 21128], so the call fails. How should this be modified?

python run_pretraining.py --data_dir=dataset/ --vocab_path=prev_trained_model/electra_tiny/vocab.txt --data_name=electra --config_path=prev_trained_model/electra_tiny/config.json --output_dir=outputs/

03/31/2022 11:38:38 - INFO - root - samples_per_epoch: 173385
03/31/2022 11:38:38 - INFO - root - device: cuda , distributed training: False, 16-bits training: False
03/31/2022 11:38:38 - INFO - model.tokenization_utils - Model name 'prev_trained_model/electra_tiny/vocab.txt' not found in model shortcut name list (). Assuming 'prev_trained_model/electra_tiny/vocab.txt' is a path or url to a directory containing tokenizer files.
03/31/2022 11:38:38 - INFO - model.tokenization_utils - Didn't find file prev_trained_model/electra_tiny/added_tokens.json. We won't load it.
03/31/2022 11:38:38 - INFO - model.tokenization_utils - Didn't find file prev_trained_model/electra_tiny/special_tokens_map.json. We won't load it.
03/31/2022 11:38:38 - INFO - model.tokenization_utils - Didn't find file prev_trained_model/electra_tiny/tokenizer_config.json. We won't load it.
03/31/2022 11:38:38 - INFO - model.tokenization_utils - loading file prev_trained_model/electra_tiny/vocab.txt
03/31/2022 11:38:38 - INFO - model.tokenization_utils - loading file None
03/31/2022 11:38:38 - INFO - model.tokenization_utils - loading file None
03/31/2022 11:38:38 - INFO - model.tokenization_utils - loading file None
03/31/2022 11:38:38 - INFO - model.configuration_utils - loading configuration file prev_trained_model/electra_tiny/config.json
03/31/2022 11:38:38 - INFO - model.configuration_utils - Model config {
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "disc_weight": 50,
  "embedding_size": 312,
  "finetuning_task": null,
  "gen_weight": 1.0,
  "generator_size": "1/4",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 312,
  "initializer_range": 0.02,
  "intermediate_size": 1200,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 4,
  "num_labels": 2,
  "output_attentions": false,
  "output_hidden_states": false,
  "pooler_fc_size": 312,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "pruned_heads": {},
  "temperature": 1.0,
  "torchscript": false,
  "type_vocab_size": 2,
  "vocab_size": 21128
}

03/31/2022 11:38:41 - INFO - root - ***** Running training *****
03/31/2022 11:38:41 - INFO - root -   Num examples = 693540
03/31/2022 11:38:41 - INFO - root -   Batch size = 32
03/31/2022 11:38:41 - INFO - root -   Num steps = 21673
03/31/2022 11:38:41 - INFO - root -   warmup_steps = 2167
03/31/2022 11:38:41 - INFO - root - Loading training examples for dataset/corpus/train/electra_file_0.json
03/31/2022 11:38:43 - INFO - root - Loading complete!
torch.Size([32, 128, 21128])
Traceback (most recent call last):
  File "run_pretraining.py", line 363, in <module>
    main()
  File "run_pretraining.py", line 276, in main
    outputs = model(input_ids=input_ids,
  File "/home/zorro/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/zorro/NLP/中文预训练语言模型/electra_pytorch_chinese-master/model/modeling_electra.py", line 735, in forward
    sample_ids = temperature_sampling(g_logits, self.config.temperature)
  File "/home/zorro/NLP/中文预训练语言模型/electra_pytorch_chinese-master/model/modeling_utils.py", line 763, in temperature_sampling
    pred_ids = probs.cpu().multinomial(probs.size()[1], replacement=False)
RuntimeError: prob_dist must be 1 or 2 dim
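One way to get past the RuntimeError (a sketch of a workaround, not necessarily how the repository authors intend temperature_sampling to behave) is to flatten the batch and sequence dimensions before calling multinomial, draw a single token id per position, and then restore the original shape. The snippet below assumes the generator logits have shape [batch, seq_len, vocab_size], as suggested by the torch.Size([32, 128, 21128]) printed above; with only one sample per row, replacement no longer matters:

    import torch

    def temperature_sampling(logits, temperature=1.0):
        # logits: [batch, seq_len, vocab_size], e.g. [32, 128, 21128]
        probs = torch.softmax(logits / temperature, dim=-1)
        batch, seq_len, vocab_size = probs.size()
        # torch.multinomial only accepts 1-D or 2-D weights, so merge the
        # batch and sequence dimensions into one row per token position.
        flat_probs = probs.reshape(batch * seq_len, vocab_size)
        # Draw one token id per position.
        pred_ids = torch.multinomial(flat_probs, num_samples=1)
        return pred_ids.reshape(batch, seq_len)

This keeps the sampling on the same device as the logits; whether the original .cpu() call and the multiple samples per position were intentional is something only the repository authors can confirm.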

MrSworder · Mar 31 '22 03:03