
RuntimeError: The expanded size of the tensor (63) must match the existing size (64) at non-singleton dimension 0

Open SunshineBot opened this issue 5 years ago • 4 comments

I ran into the following error while running predict.py with the RecoverSAT model:

Traceback (most recent call last):
  File "predict.py", line 93, in <module>
    main()
  File "predict.py", line 83, in main
    results = translator.translate(input_data)
  File "/code/translator.py", line 39, in translate
    batch_pred = self.recover_nat_translate_batch(batch)
  File "/code/translator.py", line 124, in recover_nat_translate_batch
    position = position.expand(cur_bsz, -1)
RuntimeError: The expanded size of the tensor (63) must match the existing size (64) at non-singleton dimension 0

The command I used:

python3 predict.py \
    --model_path $MODEL_DIR/$CKPT.ckp \
    --input_file $TEST_DATA_PREFIX.en \
    --output_file $MEASURE_DIR/test.en.pred.$CKPT \
    --vocab_path $VOCAB_FILE > $LOG_DIR/decode.$STYPE.$CKPT.log 2>&1

I'm not sure what has happened.

SunshineBot avatar Nov 27 '20 07:11 SunshineBot

Hello! After training reached 1000 steps I got the error below. What is going on? My command was:

CUDA_VISIBLE_DEVICES='1','4','5','6','7' python train.py \
    --model_name RecoverSAT \
    --segment_num 2 \
    --dataset IWSLT16 \
    --init_encoder_path ./checkpoint-token-zh-ti/b4-epoch-208-batch-441.ckp \
    --train_src_file ../corpus/token/train.token.zh.shuf \
    --train_tgt_file ../corpus/token/train.token.ti.shuf \
    --valid_src_file ../corpus/token/dev.token.zh \
    --valid_tgt_file ../corpus/token/dev.token.ti \
    --vocab_path ../corpus/token/vocab.token.zhti.txt

The error output:

01/19/2021 08:48:31 - INFO - main - Epoch=0 batch=1000 step=1000 loss=5.383245
/pytorch/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:19: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
Traceback (most recent call last):
  File "train.py", line 391, in <module>
    main()
  File "train.py", line 354, in main
    beam_size=beam_size, args=args, device=device)
  File "/home2/sj/anaconda3/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
    return func(*args, **kwargs)
  File "train.py", line 116, in evaluation
    result = translator.translate(valid_data)
  File "/home2/sj/translation/RecoverSAT-master_cp/translator.py", line 38, in translate
    batch_pred = self.recover_nat_translate_batch(batch)
  File "/home2/sj/translation/RecoverSAT-master_cp/translator.py", line 127, in recover_nat_translate_batch
    searcher.search_one_step(logit)
  File "/home2/sj/translation/RecoverSAT-master_cp/greedysearch.py", line 46, in search_one_step
    self.is_seg_finished = self.is_seg_finished | cur_seg_finished
RuntimeError: Expected object of scalar type Byte but got scalar type Bool for argument #2 'other' in call to _th_or

Shajiu avatar Jan 19 '21 01:01 Shajiu

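You are probably using a newer version of PyTorch, which has tightened its restrictions on types and operations. The masked_fill_ operation no longer supports uint8 masks, only bool ones. You can either roll back to an older PyTorch version or change the variable's type (change the type of the original variable directly, or add a type cast). I ran into this problem before, and changing the variable type fixed it. See the PyTorch documentation for details.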
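The type change suggested above might look roughly like the sketch below. This is not the repository's actual code: the tensor names are borrowed from the traceback, while the shapes and values are made up for illustration. It assumes the fix is either to cast the uint8 flag with `.bool()` where it is used, or to create it as `torch.bool` in the first place.

```python
import torch

# Hypothetical stand-ins for the flags in greedysearch.py (shapes/values are invented).
is_seg_finished = torch.zeros(4, dtype=torch.uint8)           # legacy uint8 flag tensor
cur_seg_finished = torch.tensor([True, False, True, False])   # comparisons yield bool

# Option 1: add a cast where the two tensors are combined.
merged_cast = is_seg_finished.bool() | cur_seg_finished

# Option 2: create the flag tensor as bool from the start.
is_seg_finished_bool = torch.zeros(4, dtype=torch.bool)
merged_bool = is_seg_finished_bool | cur_seg_finished

# A bool mask also avoids the masked_fill_ deprecation warning.
scores = torch.zeros(4)
scores.masked_fill_(merged_bool, float("-inf"))
```

Option 2 is usually the tidier choice, since every later use of the flag then has a consistent dtype without per-call casts.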

SunshineBot avatar Jan 19 '21 01:01 SunshineBot

Thank you!

Shajiu avatar Jan 19 '21 07:01 Shajiu

Maybe position_base should be initialized with size (1 * segment_num) rather than (batch_size * segment_num). The error occurs because expand() cannot change a dimension whose existing size is larger than 1, so when a batch (here 63) is smaller than batch_size (64), dimension 0 cannot be expanded to match.
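A minimal sketch of that behaviour, using the sizes from the error message (64 configured, 63 in the final batch). The contents of `position_base` here are placeholders, not the values translator.py actually stores, and `segment_num = 2` is an assumption.

```python
import torch

segment_num = 2    # assumed value, for illustration only
batch_size = 64    # configured batch size
cur_bsz = 63       # a smaller final batch, as in the error message

# Initialized with the full batch size: expand() cannot change dimension 0.
position_full = torch.zeros(batch_size, segment_num, dtype=torch.long)
try:
    position_full.expand(cur_bsz, -1)
    expand_failed = False
except RuntimeError:
    expand_failed = True

# Initialized with a singleton dimension 0: expand() works for any batch size.
position_base = torch.zeros(1, segment_num, dtype=torch.long)
position = position_base.expand(cur_bsz, -1)
print(expand_failed, tuple(position.shape))
```

Note that expand() returns a view that shares memory across the expanded rows, so if the result is later modified in place it would need a `.clone()` first.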

Coda-s avatar Oct 19 '21 10:10 Coda-s