GAN_Mask-RCNN ValueError: Target size (torch.Size([246, 28, 28])) must be the same as input size (torch.Size([246, 48, 48]))

When I run train.py, it raises a ValueError: Target size (torch.Size([246, 28, 28])) must be the same as input size (torch.Size([246, 48, 48])). The full log and traceback are below:

`python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train.py --config-file "configs/my_e2e_mask_rcnn_R_101_FPN_1x_phone.yaml"

Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.

2021-01-15 18:23:14,887 maskrcnn_benchmark INFO: Using 2 GPUs 2021-01-15 18:23:14,887 maskrcnn_benchmark INFO: Namespace(config_file='configs/my_e2e_mask_rcnn_R_101_FPN_1x_phone.yaml', d_ckpt=None, distributed=True, local_rank=0, opts=[], skip_test=False) 2021-01-15 18:23:14,887 maskrcnn_benchmark INFO: Collecting env info (might take some time) 2021-01-15 18:23:16,917 maskrcnn_benchmark INFO: PyTorch version: 1.2.0 Is debug build: No CUDA used to build PyTorch: 10.0.130

OS: Ubuntu 16.04.7 LTS GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609 CMake version: Could not collect

Python version: 3.6 Is CUDA available: Yes CUDA runtime version: 10.1.105 GPU models and configuration: GPU 0: GeForce RTX 2080 Ti GPU 1: GeForce RTX 2080 Ti

Nvidia driver version: 440.82 cuDNN version: /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudnn.so.7

Versions of relevant libraries: [pip3] numpy==1.19.5 [pip3] torch==1.2.0 [pip3] torchvision==0.2.1 [conda] blas 1.0 mkl https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main [conda] mkl 2020.2 256 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main [conda] mkl-service 2.3.0 py36he8ac12f_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main [conda] mkl_fft 1.2.0 py36h23d657b_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main [conda] mkl_random 1.1.1 py36h0573a6f_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main [conda] pytorch 1.2.0 py3.6_cuda10.0.130_cudnn7.6.2_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch [conda] torchvision 0.2.1 py_2 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch Pillow (8.1.0) 2021-01-15 18:23:16,918 maskrcnn_benchmark INFO: Loaded configuration file configs/my_e2e_mask_rcnn_R_101_FPN_1x_phone.yaml 2021-01-15 18:23:16,918 maskrcnn_benchmark INFO: MODEL: META_ARCHITECTURE: "GeneralizedRCNN" #WEIGHT: "catalog://ImageNetPretrained/MSRA/R-101" WEIGHT: "" DEVICE: "cuda" BACKBONE: CONV_BODY: "R-101-FPN" RESNETS: BACKBONE_OUT_CHANNELS: 256 RPN: USE_FPN: True ANCHOR_STRIDE: (4, 8, 16, 32, 64) PRE_NMS_TOP_N_TRAIN: 4000 FPN_POST_NMS_TOP_N_TRAIN: 8000 PRE_NMS_TOP_N_TEST: 3000 POST_NMS_TOP_N_TEST: 3000 FPN_POST_NMS_TOP_N_TEST: 30 ROI_HEADS: USE_FPN: True ROI_BOX_HEAD: POOLER_RESOLUTION: 7 POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125) POOLER_SAMPLING_RATIO: 2 FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor" PREDICTOR: "FPNPredictor" NUM_CLASSES: 2 ROI_MASK_HEAD: POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125) FEATURE_EXTRACTOR: "MaskRCNNFPNFeatureExtractor" PREDICTOR: "MaskRCNNC4Predictor" POOLER_RESOLUTION: 14 POOLER_SAMPLING_RATIO: 2 RESOLUTION: 28 SHARE_BOX_FEATURE_EXTRACTOR: False MASK_ON: True DATASETS: TRAIN: ("coco_2014_train", ) TEST: ("coco_2014_val",) DATALOADER: SIZE_DIVISIBILITY: 32 SOLVER: BASE_LR: 0.0001 WEIGHT_DECAY: 0.0001 STEPS: (36000, 48000) MAX_ITER: 54000 IMS_PER_BATCH: 2 TEST_PERIOD: 2500 
CHECKPOINT_PERIOD: 2500 WARMUP_ITERS: 10000 WARMUP_FACTOR: 0.0001 INPUT: MIN_SIZE_TRAIN: (400, ) MIN_SIZE_TEST: 400 OUTPUT_DIR: "gan_output" TEST: IMS_PER_BATCH: 2 2021-01-15 18:23:16,919 maskrcnn_benchmark INFO: Running with config: AMP_VERBOSE: False DATALOADER: ASPECT_RATIO_GROUPING: True NUM_WORKERS: 4 SIZE_DIVISIBILITY: 32 DATASETS: TEST: ('coco_2014_val',) TRAIN: ('coco_2014_train',) DTYPE: float32 INPUT: BRIGHTNESS: 0.0 CONTRAST: 0.0 HUE: 0.0 MAX_SIZE_TEST: 1333 MAX_SIZE_TRAIN: 1333 MIN_SIZE_TEST: 400 MIN_SIZE_TRAIN: (400,) PIXEL_MEAN: [102.9801, 115.9465, 122.7717] PIXEL_STD: [1.0, 1.0, 1.0] SATURATION: 0.0 TO_BGR255: True VERTICAL_FLIP_PROB_TRAIN: 0.0 MODEL: BACKBONE: CONV_BODY: R-101-FPN FREEZE_CONV_BODY_AT: 2 CLS_AGNOSTIC_BBOX_REG: False DEVICE: cuda FBNET: ARCH: default ARCH_DEF: BN_TYPE: bn DET_HEAD_BLOCKS: [] DET_HEAD_LAST_SCALE: 1.0 DET_HEAD_STRIDE: 0 DW_CONV_SKIP_BN: True DW_CONV_SKIP_RELU: True KPTS_HEAD_BLOCKS: [] KPTS_HEAD_LAST_SCALE: 0.0 KPTS_HEAD_STRIDE: 0 MASK_HEAD_BLOCKS: [] MASK_HEAD_LAST_SCALE: 0.0 MASK_HEAD_STRIDE: 0 RPN_BN_TYPE: RPN_HEAD_BLOCKS: 0 SCALE_FACTOR: 1.0 WIDTH_DIVISOR: 1 FPN: USE_GN: False USE_RELU: False GROUP_NORM: DIM_PER_GP: -1 EPSILON: 1e-05 NUM_GROUPS: 32 KEYPOINT_ON: False MASK_ON: True META_ARCHITECTURE: GeneralizedRCNN RESNETS: BACKBONE_OUT_CHANNELS: 256 DEFORMABLE_GROUPS: 1 NUM_GROUPS: 1 RES2_OUT_CHANNELS: 256 RES5_DILATION: 1 STAGE_WITH_DCN: (False, False, False, False) STEM_FUNC: StemWithFixedBatchNorm STEM_OUT_CHANNELS: 64 STRIDE_IN_1X1: True TRANS_FUNC: BottleneckWithFixedBatchNorm WIDTH_PER_GROUP: 64 WITH_MODULATED_DCN: False RETINANET: ANCHOR_SIZES: (32, 64, 128, 256, 512) ANCHOR_STRIDES: (8, 16, 32, 64, 128) ASPECT_RATIOS: (0.5, 1.0, 2.0) BBOX_REG_BETA: 0.11 BBOX_REG_WEIGHT: 4.0 BG_IOU_THRESHOLD: 0.4 FG_IOU_THRESHOLD: 0.5 INFERENCE_TH: 0.05 LOSS_ALPHA: 0.25 LOSS_GAMMA: 2.0 NMS_TH: 0.4 NUM_CLASSES: 81 NUM_CONVS: 4 OCTAVE: 2.0 PRE_NMS_TOP_N: 1000 PRIOR_PROB: 0.01 SCALES_PER_OCTAVE: 3 STRADDLE_THRESH: 0 USE_C5: 
True RETINANET_ON: False ROI_BOX_HEAD: CONV_HEAD_DIM: 256 DILATION: 1 FEATURE_EXTRACTOR: FPN2MLPFeatureExtractor MLP_HEAD_DIM: 1024 NUM_CLASSES: 2 NUM_STACKED_CONVS: 4 POOLER_RESOLUTION: 7 POOLER_SAMPLING_RATIO: 2 POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125) PREDICTOR: FPNPredictor USE_GN: False ROI_HEADS: BATCH_SIZE_PER_IMAGE: 512 BBOX_REG_WEIGHTS: (10.0, 10.0, 5.0, 5.0) BG_IOU_THRESHOLD: 0.5 DETECTIONS_PER_IMG: 100 FG_IOU_THRESHOLD: 0.5 NMS: 0.5 POSITIVE_FRACTION: 0.25 SCORE_THRESH: 0.05 USE_FPN: True ROI_KEYPOINT_HEAD: CONV_LAYERS: (512, 512, 512, 512, 512, 512, 512, 512) FEATURE_EXTRACTOR: KeypointRCNNFeatureExtractor MLP_HEAD_DIM: 1024 NUM_CLASSES: 17 POOLER_RESOLUTION: 14 POOLER_SAMPLING_RATIO: 0 POOLER_SCALES: (0.0625,) PREDICTOR: KeypointRCNNPredictor RESOLUTION: 14 SHARE_BOX_FEATURE_EXTRACTOR: True ROI_MASK_HEAD: CONV_LAYERS: (256, 256, 256, 256) DILATION: 1 FEATURE_EXTRACTOR: MaskRCNNFPNFeatureExtractor MLP_HEAD_DIM: 1024 POOLER_RESOLUTION: 14 POOLER_SAMPLING_RATIO: 2 POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125) POSTPROCESS_MASKS: False POSTPROCESS_MASKS_THRESHOLD: 0.5 PREDICTOR: MaskRCNNC4Predictor RESOLUTION: 28 SHARE_BOX_FEATURE_EXTRACTOR: False USE_GN: False RPN: ANCHOR_SIZES: (32, 64, 128, 256, 512) ANCHOR_STRIDE: (4, 8, 16, 32, 64) ASPECT_RATIOS: (0.5, 1.0, 2.0) BATCH_SIZE_PER_IMAGE: 256 BG_IOU_THRESHOLD: 0.3 FG_IOU_THRESHOLD: 0.7 FPN_POST_NMS_PER_BATCH: True FPN_POST_NMS_TOP_N_TEST: 30 FPN_POST_NMS_TOP_N_TRAIN: 8000 MIN_SIZE: 0 NMS_THRESH: 0.7 POSITIVE_FRACTION: 0.5 POST_NMS_TOP_N_TEST: 3000 POST_NMS_TOP_N_TRAIN: 2000 PRE_NMS_TOP_N_TEST: 3000 PRE_NMS_TOP_N_TRAIN: 4000 RPN_HEAD: SingleConvRPNHead STRADDLE_THRESH: 0 USE_FPN: True RPN_ONLY: False WEIGHT: OUTPUT_DIR: gan_output PATHS_CATALOG: /media/vip/3e77cb89-a384-4659-8d70-e9ce8dc3a977/Donglihui/GAN_Mask-RCNN/maskrcnn-benchmark/maskrcnn_benchmark/config/paths_catalog.py SOLVER: BASE_LR: 0.0001 BIAS_LR_FACTOR: 2 CHECKPOINT_PERIOD: 2500 GAMMA: 0.1 IMS_PER_BATCH: 2 MAX_ITER: 54000 MOMENTUM: 0.9 
STEPS: (36000, 48000) TEST_PERIOD: 2500 WARMUP_FACTOR: 0.0001 WARMUP_ITERS: 10000 WARMUP_METHOD: linear WEIGHT_DECAY: 0.0001 WEIGHT_DECAY_BIAS: 0 TEST: BBOX_AUG: ENABLED: False H_FLIP: False MAX_SIZE: 4000 SCALES: () SCALE_H_FLIP: False DETECTIONS_PER_IMG: 100 EXPECTED_RESULTS: [] EXPECTED_RESULTS_SIGMA_TOL: 4 IMS_PER_BATCH: 2 2021-01-15 18:23:16,920 maskrcnn_benchmark INFO: Saving config into: gan_output/config.yml Selected optimization level O0: Pure FP32 training.

Defaults for this optimization level are: enabled : True opt_level : O0 cast_model_type : torch.float32 patch_torch_functions : False keep_batchnorm_fp32 : None master_weights : False loss_scale : 1.0 Processing user overrides (additional kwargs that are not None)... After processing overrides, optimization options are: enabled : True opt_level : O0 cast_model_type : torch.float32 patch_torch_functions : False keep_batchnorm_fp32 : None master_weights : False loss_scale : 1.0 loading annotations into memory... Selected optimization level O0: Pure FP32 training.

Defaults for this optimization level are: enabled : True opt_level : O0 cast_model_type : torch.float32 patch_torch_functions : False keep_batchnorm_fp32 : None master_weights : False loss_scale : 1.0 Processing user overrides (additional kwargs that are not None)... After processing overrides, optimization options are: enabled : True opt_level : O0 cast_model_type : torch.float32 patch_torch_functions : False keep_batchnorm_fp32 : None master_weights : False loss_scale : 1.0 Done (t=0.08s) creating index... index created! loading annotations into memory... 2021-01-15 18:23:17,943 maskrcnn_benchmark.utils.checkpoint INFO: No checkpoint found. Initializing model from scratch loading annotations into memory... Done (t=0.02s) creating index... index created! Done (t=0.09s) creating index... index created! 2021-01-15 18:23:18,042 maskrcnn_benchmark.utils.miscellaneous INFO: Saving labels mapping into gan_output/labels.json loading annotations into memory... Done (t=0.02s) creating index... index created! 
2021-01-15 18:23:18,065 maskrcnn_benchmark.trainer INFO: Start training Traceback (most recent call last): File "tools/train.py", line 995, in main() File "tools/train.py", line 992, in main model = train(cfg, args.local_rank, args.distributed, args.d_ckpt) File "tools/train.py", line 806, in train g_loss_dict, d_loss_dict = g_rcnn(images, targets) File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 442, in forward output = self.module(*inputs[0], **kwargs[0]) File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "tools/train.py", line 512, in forward outputs = self.Gnet(images, targets) File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/_initialize.py", line 197, in new_fwd **applier(kwargs, input_caster)) File "tools/train.py", line 474, in forward x, result, detector_losses = self.roi_heads(features, proposals, targets) File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "/media/vip/3e77cb89-a384-4659-8d70-e9ce8dc3a977/Donglihui/GAN_Mask-RCNN/maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/roi_heads.py", line 39, in forward x, detections, loss_mask = self.mask(mask_features, detections, targets) File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File 
"/media/vip/3e77cb89-a384-4659-8d70-e9ce8dc3a977/Donglihui/GAN_Mask-RCNN/maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/mask_head/mask_head.py", line 78, in forward loss_mask = self.loss_evaluator(proposals, mask_logits, targets) File "tools/train.py", line 224, in call mask_logits[positive_inds, labels_pos], mask_targets File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/functional.py", line 2098, in binary_cross_entropy_with_logits raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size())) ValueError: Target size (torch.Size([102, 28, 28])) must be the same as input size (torch.Size([102, 48, 48])) Traceback (most recent call last): File "tools/train.py", line 995, in main() File "tools/train.py", line 992, in main model = train(cfg, args.local_rank, args.distributed, args.d_ckpt) File "tools/train.py", line 806, in train g_loss_dict, d_loss_dict = g_rcnn(images, targets) File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 442, in forward output = self.module(*inputs[0], **kwargs[0]) File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "tools/train.py", line 512, in forward outputs = self.Gnet(images, targets) File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/_initialize.py", line 197, in new_fwd **applier(kwargs, input_caster)) File "tools/train.py", line 474, in forward x, result, detector_losses = 
self.roi_heads(features, proposals, targets) File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "/media/vip/3e77cb89-a384-4659-8d70-e9ce8dc3a977/Donglihui/GAN_Mask-RCNN/maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/roi_heads.py", line 39, in forward x, detections, loss_mask = self.mask(mask_features, detections, targets) File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "/media/vip/3e77cb89-a384-4659-8d70-e9ce8dc3a977/Donglihui/GAN_Mask-RCNN/maskrcnn-benchmark/maskrcnn_benchmark/modeling/roi_heads/mask_head/mask_head.py", line 78, in forward loss_mask = self.loss_evaluator(proposals, mask_logits, targets) File "tools/train.py", line 224, in call mask_logits[positive_inds, labels_pos], mask_targets File "/home/vip/anaconda/envs/maskrcnn_benchmark/lib/python3.6/site-packages/torch/nn/functional.py", line 2098, in binary_cross_entropy_with_logits raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size())) ValueError: Target size (torch.Size([128, 28, 28])) must be the same as input size (torch.Size([128, 48, 48]))`

Any suggested quick fixes?

Thanks!

Jan 15 '21 10:01 dlh199799

I ran into the same problem. Have you found a solution? @dlh199799

Jan 27 '21 12:01 weicrane

Sorry, I haven't solved it yet. I've given up @weicrane

Jan 27 '21 12:01 dlh199799

@weicrane Oh, you're from China too. Could you leave your contact details so we can discuss this?

Jan 27 '21 12:01 dlh199799

Okkk, thank you. If I find a solution, I will write an answer here. @dlh199799

Jan 27 '21 13:01 weicrane

Well, you can solve this issue by changing the following in gan_mrcnn.py:

    self.roi_heads.mask.feature_extractor.pooler = AdaptivePooler(
        output_size=(24, 24),
        scales=(0.25, 0.125, 0.0625, 0.03125),
        sampling_ratio=2,
    )

to

    self.roi_heads.mask.feature_extractor.pooler = AdaptivePooler(
        output_size=(14, 14),
        scales=(0.25, 0.125, 0.0625, 0.03125),
        sampling_ratio=2,
    )

but that will introduce other issues further down the network, particularly with the kernel size. I'm rewriting the entire code base to fix the bugs and upgrade the library. The code provided in this repo simply does not work, and with the authors no longer maintaining the repo, it is difficult to debug.
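For anyone wondering why 24 versus 14 matters, here is a rough sketch of the shape arithmetic. It assumes the mask predictor upsamples the pooled features once with a kernel-2, stride-2 transposed convolution, as the stock maskrcnn-benchmark `MaskRCNNC4Predictor` does; I have not verified that this fork leaves that layer unchanged. The helper name `conv_transpose_out` is made up for illustration.

```python
def conv_transpose_out(size, kernel=2, stride=2, padding=0,
                       output_padding=0, dilation=1):
    """Output length of a transposed convolution along one spatial dimension.

    This is the standard ConvTranspose2d output-size formula. The default
    kernel/stride values are an ASSUMPTION about the mask predictor.
    """
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel - 1) + output_padding + 1)


# ROI_MASK_HEAD.RESOLUTION from the config in the log: mask targets are 28x28.
TARGET_RESOLUTION = 28

for pooler_size in (24, 14):
    logits_size = conv_transpose_out(pooler_size)
    status = "matches" if logits_size == TARGET_RESOLUTION else "MISMATCH"
    print(f"pooler {pooler_size}x{pooler_size} -> logits "
          f"{logits_size}x{logits_size} vs targets "
          f"{TARGET_RESOLUTION}x{TARGET_RESOLUTION}: {status}")
```

Under that assumption, a 24x24 pooler gives 48x48 mask logits, which is the input size in the error, while a 14x14 pooler gives 28x28, which lines up with RESOLUTION: 28.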

Mar 24 '21 14:03 AloshkaD

Indeed, it is as you said, @AloshkaD. I tried changing the AdaptivePooler output size to 14×14 and ran into the kernel size problem in the convolutional layers. I then tried modifying the Conv2d in convblock1 so that it keeps the spatial size and only changes the number of channels, but empty tensors appeared soon after training started, and training eventually failed. Looking forward to your fixed version of this repo. Thank you very much.
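In case it helps others debug this, below is a small sketch for checking whether a stack of unpadded convolutions can still accept a smaller input. The layer list is purely hypothetical (I am not claiming these are the actual kernel sizes and strides in convblock1); replace it with the values from gan_mrcnn.py. The helpers `conv_out` and `trace_sizes` are names I made up.

```python
def conv_out(size, kernel, stride=1, padding=0, dilation=1):
    """Output length of a Conv2d along one spatial dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def trace_sizes(size, layers):
    """Walk (kernel, stride, padding) tuples and return the size after each layer.

    Raises ValueError when a kernel is larger than the padded input, which is
    the situation that produces a kernel-size error in a real Conv2d.
    """
    sizes = [size]
    for index, (kernel, stride, padding) in enumerate(layers):
        padded = size + 2 * padding
        if kernel > padded:
            raise ValueError(
                f"layer {index}: kernel {kernel} is larger than padded input {padded}"
            )
        size = conv_out(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


# HYPOTHETICAL layer stack, for illustration only; not taken from the repo.
HYPOTHETICAL_LAYERS = [(5, 2, 0), (5, 2, 0), (3, 1, 0)]

print(trace_sizes(24, HYPOTHETICAL_LAYERS))
try:
    print(trace_sizes(14, HYPOTHETICAL_LAYERS))
except ValueError as err:
    print("14x14 input fails:", err)
```

With a stack like this, a 24x24 input survives all three layers but a 14x14 input shrinks below the last kernel size, so simply changing the pooler output without adjusting the downstream convolutions cannot work.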

Mar 24 '21 15:03 weicrane

Is there a test file for outputting recognition results here?

Oct 22 '22 04:10 Zoysia3689
