electra_pytorch
Problem with temperature_sampling in modeling_utils.py
This is the error I get when running run_pretraining.py. The traceback points at this line in the temperature_sampling function:

pred_ids = probs.cpu().multinomial(probs.size()[1], replacement=False)

torch.multinomial only accepts weights that are 1-D or 2-D, but probs here has shape [32, 128, 21128], so the call fails. How should this be fixed?
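For context, the restriction is easy to reproduce in isolation. A minimal sketch (not code from the repo), using the same shapes as the probs tensor above:

```python
import torch

# probs as produced by the generator head: [batch, seq_len, vocab_size]
probs = torch.softmax(torch.randn(32, 128, 21128), dim=-1)

# Fails: torch.multinomial only accepts 1-D or 2-D weight tensors,
# and probs here is 3-D.
pred_ids = probs.cpu().multinomial(probs.size()[1], replacement=False)
```

The full command and log output: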
python run_pretraining.py --data_dir=dataset/ --vocab_path=prev_trained_model/electra_tiny/vocab.txt --data_name=electra --config_path=prev_trained_model/electra_tiny/config.json --output_dir=outputs/
03/31/2022 11:38:38 - INFO - root - samples_per_epoch: 173385
03/31/2022 11:38:38 - INFO - root - device: cuda , distributed training: False, 16-bits training: False
03/31/2022 11:38:38 - INFO - model.tokenization_utils - Model name 'prev_trained_model/electra_tiny/vocab.txt' not found in model shortcut name list (). Assuming 'prev_trained_model/electra_tiny/vocab.txt' is a path or url to a directory containing tokenizer files.
03/31/2022 11:38:38 - INFO - model.tokenization_utils - Didn't find file prev_trained_model/electra_tiny/added_tokens.json. We won't load it.
03/31/2022 11:38:38 - INFO - model.tokenization_utils - Didn't find file prev_trained_model/electra_tiny/special_tokens_map.json. We won't load it.
03/31/2022 11:38:38 - INFO - model.tokenization_utils - Didn't find file prev_trained_model/electra_tiny/tokenizer_config.json. We won't load it.
03/31/2022 11:38:38 - INFO - model.tokenization_utils - loading file prev_trained_model/electra_tiny/vocab.txt
03/31/2022 11:38:38 - INFO - model.tokenization_utils - loading file None
03/31/2022 11:38:38 - INFO - model.tokenization_utils - loading file None
03/31/2022 11:38:38 - INFO - model.tokenization_utils - loading file None
03/31/2022 11:38:38 - INFO - model.configuration_utils - loading configuration file prev_trained_model/electra_tiny/config.json
03/31/2022 11:38:38 - INFO - model.configuration_utils - Model config {
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "disc_weight": 50,
  "embedding_size": 312,
  "finetuning_task": null,
  "gen_weight": 1.0,
  "generator_size": "1/4",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 312,
  "initializer_range": 0.02,
  "intermediate_size": 1200,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 4,
  "num_labels": 2,
  "output_attentions": false,
  "output_hidden_states": false,
  "pooler_fc_size": 312,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "pruned_heads": {},
  "temperature": 1.0,
  "torchscript": false,
  "type_vocab_size": 2,
  "vocab_size": 21128
}
03/31/2022 11:38:41 - INFO - root - ***** Running training *****
03/31/2022 11:38:41 - INFO - root - Num examples = 693540
03/31/2022 11:38:41 - INFO - root - Batch size = 32
03/31/2022 11:38:41 - INFO - root - Num steps = 21673
03/31/2022 11:38:41 - INFO - root - warmup_steps = 2167
03/31/2022 11:38:41 - INFO - root - Loading training examples for dataset/corpus/train/electra_file_0.json
03/31/2022 11:38:43 - INFO - root - Loading complete!
torch.Size([32, 128, 21128])
Traceback (most recent call last):
File "run_pretraining.py", line 363, in