RuntimeError: stack expects each tensor to be equal size, but got [1, 677, 347] at entry 0 and [1, 512, 512] at entry 1
Traceback (most recent call last):
File "train.py", line 104, in
@apsthe I think you should check whether each training image and its annotation image have the same size, for example whether the training image is portrait while the annotation is landscape.
This is also related to the RandomResize and RandomCrop settings in the configs/_base_/datasets/xxx.py file.
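A quick way to check this is to compare the size of every image/annotation pair before training. The sketch below uses PIL; the directory names and file suffixes are assumptions about your dataset layout, so adjust them to match yours:

```python
# Hedged sketch: report image/annotation pairs whose (width, height) differ.
# `img_dir`, `ann_dir` and the suffixes are assumptions -- adapt as needed.
import os
from PIL import Image

def find_size_mismatches(img_dir, ann_dir, img_suffix=".jpg", ann_suffix=".png"):
    """Return a list of (filename, img_size, ann_size) tuples that disagree."""
    mismatches = []
    for name in sorted(os.listdir(img_dir)):
        if not name.endswith(img_suffix):
            continue
        stem = name[: -len(img_suffix)]
        ann_path = os.path.join(ann_dir, stem + ann_suffix)
        if not os.path.exists(ann_path):
            continue
        with Image.open(os.path.join(img_dir, name)) as img, \
             Image.open(ann_path) as ann:
            if img.size != ann.size:  # PIL .size is (width, height)
                mismatches.append((name, img.size, ann.size))
    return mismatches
```

Any pair this reports (e.g. a portrait image paired with a landscape annotation) should be fixed in the dataset itself before touching the pipeline config.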
@apsthe You encounter this error because of how the padding function is designed. When you prepare the data pipeline in mmsegmentation/configs/_base_/datasets/your_dataset.py you use the RandomResize method, which needs the parameters scale and ratio_range. Consider the case where you provide a tuple for both, and say you define the size of your input images to be 512 x 512. This input size is the base size, and ratio_range sets an upper and a lower limit (let's agree on 0.5 - 2.0) relative to the original size of the input image. Then you need to set scale accordingly, i.e. scale=(1024, 256). Once these values agree, the padding applied during image resizing ensures that all images end up the same size and you will no longer encounter this error.
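To make the sampling behaviour concrete, here is a minimal sketch (an illustration, not the actual mmsegmentation implementation) of how a RandomResize-style transform picks a per-image target size from scale and ratio_range, which is exactly why the resized images vary in size until a later crop/pad step unifies them:

```python
# Illustrative sketch: RandomResize-style target-size sampling.
# A ratio is drawn uniformly from ratio_range and multiplied into the base
# scale, so every image gets a different target size; only a subsequent
# fixed-size crop (with padding) makes the batch stackable.
import random

def sample_random_resize_scale(scale, ratio_range, seed=None):
    """Return the (w, h) target a RandomResize-style op would aim for."""
    rng = random.Random(seed)
    lo, hi = ratio_range
    ratio = rng.uniform(lo, hi)
    return int(scale[0] * ratio), int(scale[1] * ratio)

# With scale=(1024, 256) and ratio_range=(0.5, 2.0), the first edge is
# sampled between 512 and 2048 and the second between 128 and 512.
```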
Do you have any solutions now?
@NiklasDHahn What should I do if my dataset image sizes are inconsistent? Thanks!

```python
dict(
    type='RandomResize',
    scale=(256, 1024),
    ratio_range=(0.5, 2.0),
    keep_ratio=True),
dict(
    type='RandomCrop',
    crop_size=(512, 512),
    cat_max_ratio=0.75),
```
The reason is that the size of the input image differs from the size of the annotation image.
With 'RandomResize' I got RuntimeError: stack expects each tensor to be equal size, but got [1, 512, 512] at entry 0 and [1, 512, 527] at entry 5, so I changed it to Resize. It works.
```python
crop_size = (512, 512)
train_pipeline = [
    dict(type="LoadImageFromFile"),
    dict(type="LoadAnnotations", reduce_zero_label=False),
    # dict(
    #     type='RandomResize',
    #     scale=(2048, 512),
    #     ratio_range=(0.5, 2.0),
    #     keep_ratio=True),
    dict(
        type="Resize",
        scale=(2048, 512),
        keep_ratio=True,
    ),
    dict(type="RandomCrop", crop_size=crop_size, cat_max_ratio=0.75),
    dict(type="RandomFlip", prob=0.5),
    dict(type="PhotoMetricDistortion"),
    dict(type="PackSegInputs"),
]
```
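To see why the fixed crop_size is what ultimately makes batching work, here is a small NumPy sketch (an illustration, not mmsegmentation code): whatever size the resize step produces, cropping to one shape, padding when the image is smaller than the crop, yields tensors that can be stacked. The two shapes below are the mismatched ones from the error messages above:

```python
# Illustration: a fixed-size crop (with zero-padding for small inputs)
# turns differently sized images into one uniform shape, so stacking
# into a batch no longer raises a size-mismatch error.
import numpy as np

def crop_or_pad(img, crop_size, seed=0):
    """Randomly crop a 2D array to crop_size, zero-padding if it is too small."""
    ch, cw = crop_size
    h, w = img.shape
    # Pad first so the crop window always fits.
    padded = np.zeros((max(h, ch), max(w, cw)), dtype=img.dtype)
    padded[:h, :w] = img
    rng = np.random.default_rng(seed)
    top = rng.integers(0, padded.shape[0] - ch + 1)
    left = rng.integers(0, padded.shape[1] - cw + 1)
    return padded[top:top + ch, left:left + cw]

batch = [
    crop_or_pad(np.ones((677, 347)), (512, 512)),  # shape from the first error
    crop_or_pad(np.ones((512, 527)), (512, 512)),  # shape from the second error
]
stacked = np.stack(batch)  # now succeeds: shape (2, 512, 512)
```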