PointNeXt
RuntimeError: The size of tensor a (719348) must match the size of tensor b (1438695) at non-singleton dimension 0
Hi @guochengqian
Have you ever run into this bug when testing with two GPUs? Thanks.
100%|██████
[07/26 11:32:16 S3DIS]: Epoch 100 LR 0.000012 train_miou 95.15, val_miou 68.49, best val miou 69.55
100%|██████████| 34/34 [00:41<00:00, 1.22s/it]
[07/26 11:32:18 S3DIS]: Best ckpt @E68, val_oa 89.78, val_macc 76.22, val_miou 69.55, iou per cls is: [93.14 97.92 83.66 0. 43.12 54.75 75.45 81.45 90.8 74.57 75.66 73.42 60.24]
[07/26 11:32:18 S3DIS]: Successful Loading the ckpt from log/s3dis/s3dis-train-pointnext-xl-ngpus2-seed7272-20220725-213543-eYW6GZURAs6oyghwFxnAPs/checkpoint/s3dis-train-pointnext-xl-ngpus2-seed7272-20220725-213543-eYW6GZURAs6oyghwFxnAPs_ckpt_best.pth
[07/26 11:32:18 S3DIS]: ckpts @ 68 epoch( {'best_val': 69.55094146728516} )
0%| | 0/68 [00:00<?, ?it/s]
0%| | 0/68 [00:27<?, ?it/s]
Traceback (most recent call last):
File "examples/segmentation/main.py", line 529, in
-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/export/home/myname/anaconda3/envs/openpoints/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/export/home/myname/Documents/PointNeXt_code/PointNeXt/examples/segmentation/main.py", line 211, in main
    test_miou, test_macc, test_oa, test_ious, test_accs, _ = test_entire_room(model, cfg.dataset.common.test_area, cfg)
  File "/export/home/myname/anaconda3/envs/openpoints/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/export/home/myname/Documents/PointNeXt_code/PointNeXt/examples/segmentation/main.py", line 452, in test_entire_room
    cm.update(all_logits.argmax(dim=1), label)
  File "/export/home/myname/anaconda3/envs/openpoints/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/export/home/myname/Documents/PointNeXt_code/PointNeXt/examples/segmentation/../../openpoints/utils/metrics.py", line 69, in update
    unique_mapping = true.flatten() * self.virtual_num_classes + pred.flatten()
RuntimeError: The size of tensor a (719348) must match the size of tensor b (1438695) at non-singleton dimension 0
As @xindeng98 mentioned in https://github.com/guochengqian/PointNeXt/issues/18#issuecomment-1182670679, this error occurs because test_entire_room does not support multi-GPU testing.
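For anyone debugging something similar: the failing line in the traceback adds the flattened labels and predictions elementwise, so both must contain the same number of points. Below is a minimal sketch of a size check that could run just before `cm.update(...)` to give a clearer message. The helper name `check_pred_label_sizes` is hypothetical and not part of PointNeXt or openpoints; it works on plain integers, so with tensors you would pass `pred.numel()` and `label.numel()`. Which of the two reported sizes belongs to the labels is an assumption here.

```python
def check_pred_label_sizes(num_pred: int, num_label: int) -> None:
    """Hypothetical guard: fail early with a readable message.

    num_pred  -- number of predicted points (e.g. pred.numel())
    num_label -- number of ground-truth labels (e.g. label.numel())
    """
    if num_pred != num_label:
        raise ValueError(
            f"size mismatch before confusion-matrix update: "
            f"{num_pred} predictions vs {num_label} labels; "
            f"if testing ran on more than one GPU, retry with a single GPU"
        )


# Sizes taken from the error message in this thread. Assumption: the
# larger one is the prediction tensor and the smaller one is the labels.
try:
    check_pred_label_sizes(num_pred=1438695, num_label=719348)
except ValueError as err:
    print(err)

# Matching sizes pass silently.
check_pred_label_sizes(num_pred=719348, num_label=719348)
```

This does not fix multi-GPU testing; it only turns the broadcast error into a message that points at the likely cause.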
Hi @haibo-qiu
Yes, it's caused by multi-GPU testing. For now, testing with a single GPU works.
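One generic way to force a single GPU, independent of PointNeXt's own options, is to restrict what CUDA can see through the `CUDA_VISIBLE_DEVICES` environment variable before the test process starts. The sketch below builds such an environment; `single_gpu_env` is a hypothetical helper, and the command you launch with it is your own test command (none is assumed here).

```python
import os
import subprocess
import sys
from typing import Mapping, Optional


def single_gpu_env(gpu_index: int = 0,
                   base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment that exposes only one GPU to CUDA."""
    env = dict(os.environ if base_env is None else base_env)
    # CUDA reads this variable when it initializes, so it has to be set
    # before the child process first touches the GPU.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


if __name__ == "__main__":
    # Placeholder child process that only echoes the variable; replace the
    # argument list with your actual single-GPU test command.
    result = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"],
        env=single_gpu_env(0),
        capture_output=True,
        text=True,
        check=True,
    )
    print(result.stdout.strip())
```

The same effect is usually achieved in a shell by prefixing the command with `CUDA_VISIBLE_DEVICES=0`.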
Thanks @haibo-qiu. I will write clearer documentation.