Auto3DSeg tutorial padding error caused by the swinunetr template
Describe the bug
The swinunetr template in Auto3DSeg fails when running on the MSD Task05_Prostate dataset. Specifically, SpatialPadd raises the error below:
[2022-09-24T19:15:57.111Z] CalledProcessError: Command '['python', '/home/jenkins/agent/workspace/Monai-notebooks/tuts/tutorials/auto3dseg/notebooks/auto3dseg_work_dir/swinunetr_2/scripts/train.py', 'run', "--config_file='/home/jenkins/agent/workspace/Monai-notebooks/tuts/tutorials/auto3dseg/notebooks/auto3dseg_work_dir/swinunetr_2/configs/transforms_validate.yaml','/home/jenkins/agent/workspace/Monai-notebooks/tuts/tutorials/auto3dseg/notebooks/auto3dseg_work_dir/swinunetr_2/configs/transforms_train.yaml','/home/jenkins/agent/workspace/Monai-notebooks/tuts/tutorials/auto3dseg/notebooks/auto3dseg_work_dir/swinunetr_2/configs/network.yaml','/home/jenkins/agent/workspace/Monai-notebooks/tuts/tutorials/auto3dseg/notebooks/auto3dseg_work_dir/swinunetr_2/configs/hyper_parameters.yaml','/home/jenkins/agent/workspace/Monai-notebooks/tuts/tutorials/auto3dseg/notebooks/auto3dseg_work_dir/swinunetr_2/configs/transforms_infer.yaml'", '--num_iterations=8', '--num_iterations_per_validation=4', '--num_images_per_batch=2', '--num_epochs=2', '--num_warmup_iterations=4']' died with <Signals.SIGABRT: 6>.
[2022-09-24T19:15:57.111Z]
[2022-09-24T19:15:57.111Z] The above exception was the direct cause of the following exception:
[2022-09-24T19:15:57.111Z]
[2022-09-24T19:15:57.111Z] RuntimeError Traceback (most recent call last)
[2022-09-24T19:15:57.111Z] Input In [8], in <cell line: 1>()
[2022-09-24T19:15:57.111Z] ----> 1 runner.run()
[2022-09-24T19:15:57.111Z]
[2022-09-24T19:15:57.111Z] File /home/jenkins/agent/workspace/Monai-notebooks/MONAI/monai/apps/auto3dseg/auto_runner.py:586, in AutoRunner.run(self)
[2022-09-24T19:15:57.111Z] 584 history = import_bundle_algo_history(self.work_dir, only_trained=False)
[2022-09-24T19:15:57.111Z] 585 if not self.hpo:
[2022-09-24T19:15:57.111Z] --> 586 self._train_algo_in_sequence(history)
[2022-09-24T19:15:57.111Z] 587 else:
[2022-09-24T19:15:57.111Z] 588 self._train_algo_in_nni(history)
[2022-09-24T19:15:57.111Z]
[2022-09-24T19:15:57.111Z] File /home/jenkins/agent/workspace/Monai-notebooks/MONAI/monai/apps/auto3dseg/auto_runner.py:488, in AutoRunner._train_algo_in_sequence(self, history)
[2022-09-24T19:15:57.111Z] 486 for task in history:
[2022-09-24T19:15:57.111Z] 487 for _, algo in task.items():
[2022-09-24T19:15:57.111Z] --> 488 algo.train(self.train_params)
[2022-09-24T19:15:57.111Z] 489 acc = algo.get_score()
[2022-09-24T19:15:57.111Z] 490 algo_to_pickle(algo, template_path=algo.template_path, best_metrics=acc)
[2022-09-24T19:15:57.111Z]
[2022-09-24T19:15:57.111Z] File /home/jenkins/agent/workspace/Monai-notebooks/MONAI/monai/apps/auto3dseg/bundle_gen.py:200, in BundleAlgo.train(self, train_params)
[2022-09-24T19:15:57.111Z] 192 """
[2022-09-24T19:15:57.111Z] 193 Load the run function in the training script of each model. Training parameter is predefined by the
[2022-09-24T19:15:57.111Z] 194 algo_config.yaml file, which is pre-filled by the fill_template_config function in the same instance.
[2022-09-24T19:15:57.111Z] (...)
[2022-09-24T19:15:57.111Z] 197 train_params: to specify the devices using a list of integers: ``{"CUDA_VISIBLE_DEVICES": [1,2,3]}``.
[2022-09-24T19:15:57.111Z] 198 """
[2022-09-24T19:15:57.111Z] 199 cmd, devices_info = self._create_cmd(train_params)
[2022-09-24T19:15:57.111Z] --> 200 return self._run_cmd(cmd, devices_info)
[2022-09-24T19:15:57.111Z]
[2022-09-24T19:15:57.111Z] File /home/jenkins/agent/workspace/Monai-notebooks/MONAI/monai/apps/auto3dseg/bundle_gen.py:188, in BundleAlgo._run_cmd(self, cmd, devices_info)
[2022-09-24T19:15:57.112Z] 186 output = repr(e.stdout).replace("\\n", "\n").replace("\\t", "\t")
[2022-09-24T19:15:57.112Z] 187 errors = repr(e.stderr).replace("\\n", "\n").replace("\\t", "\t")
[2022-09-24T19:15:57.112Z] --> 188 raise RuntimeError(f"subprocess call error {e.returncode}: {errors}, {output}") from e
[2022-09-24T19:15:57.112Z] 189 return normal_out
[2022-09-24T19:15:57.112Z]
[2022-09-24T19:15:57.112Z] RuntimeError: subprocess call error -6: b'Modifying image pixdim from [0.625 0.625 3.6 1. ] to [ 0.625 0.625 3.5999999 160.84918933]
[2022-09-24T19:15:57.112Z] Modifying image pixdim from [0.625 0.625 3.6 1. ] to [ 0.625 0.625 3.5999999 170.38896694]
[2022-09-24T19:15:57.112Z] Modifying image pixdim from [0.6249998 0.625 3.5999987 1. ] to [ 0.62499983 0.625 3.59999877 153.34152766]
[2022-09-24T19:15:57.112Z] Modifying image pixdim from [0.625 0.625 3.60001 1. ] to [ 0.625 0.625 3.60000992 151.21210251]
[2022-09-24T19:15:57.112Z] Modifying image pixdim from [0.6 0.5999997 3.999998 1. ] to [ 0.60000002 0.59999975 3.99999799 117.82332812]
[2022-09-24T19:15:57.112Z] Modifying image pixdim from [0.625 0.625 3.6 1. ] to [ 0.625 0.625 3.5999999 152.3814254]
[2022-09-24T19:15:57.112Z] Traceback (most recent call last):
[2022-09-24T19:15:57.112Z] File "/home/jenkins/agent/workspace/Monai-notebooks/MONAI/monai/transforms/croppad/array.py", line 184, in __call__
[2022-09-24T19:15:57.112Z] out = _pad(img_t, pad_width=to_pad_, mode=mode_, **kwargs_)
[2022-09-24T19:15:57.112Z] File "/home/jenkins/agent/workspace/Monai-notebooks/MONAI/monai/transforms/croppad/array.py", line 138, in _pt_pad
[2022-09-24T19:15:57.112Z] return pad_pt(img.unsqueeze(0), pt_pad_width, mode=mode, **kwargs).squeeze(0)
[2022-09-24T19:15:57.112Z] File "/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py", line 4170, in _pad
[2022-09-24T19:15:57.112Z] return handle_torch_function(_pad, (input,), input, pad, mode=mode, value=value)
[2022-09-24T19:15:57.112Z] File "/opt/conda/lib/python3.8/site-packages/torch/overrides.py", line 1355, in handle_torch_function
[2022-09-24T19:15:57.112Z] result = torch_func_method(public_api, types, args, kwargs)
[2022-09-24T19:15:57.112Z] File "/home/jenkins/agent/workspace/Monai-notebooks/MONAI/monai/data/meta_tensor.py", line 249, in __torch_function__
[2022-09-24T19:15:57.112Z] ret = super().__torch_function__(func, types, args, kwargs)
[2022-09-24T19:15:57.112Z] File "/opt/conda/lib/python3.8/site-packages/torch/_tensor.py", line 1051, in __torch_function__
[2022-09-24T19:15:57.112Z] ret = func(*args, **kwargs)
[2022-09-24T19:15:57.112Z] File "/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py", line 4199, in _pad
[2022-09-24T19:15:57.112Z] return torch._C._nn.reflection_pad3d(input, pad)
[2022-09-24T19:15:57.112Z] RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (22, 22) at dimension 4 of input [1, 2, 320, 320, 20]
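For reference, the failure can be reproduced outside the template with a few lines. This is only a sketch: the z patch extent of 64 is an assumption chosen to match the reported pad of (22, 22) on the 20-slice input, and may differ from the actual template ROI.

# Minimal sketch of the failure: torch's reflection padding requires each pad
# amount to be smaller than the corresponding input dimension, so padding a
# 20-slice volume by 22 voxels per side in z fails.
import torch
from monai.transforms import SpatialPad

img = torch.zeros(2, 320, 320, 20)  # channel-first volume with only 20 slices in z
pad = SpatialPad(spatial_size=[320, 320, 64], mode="reflect")  # needs (64 - 20) / 2 = 22 voxels per side
try:
    pad(img)
except RuntimeError as e:
    print(e)  # Padding size should be less than the corresponding input dimension ...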
Hi @mingxin-zheng , thanks for raising this. I see this is related to SwinUNETR with Task05, right? This matches the conditions of the prior test cases. The prior research contribution applies a spacing transform of 0.2 or 0.5 to avoid padding issues. Do we have an idea of the spacing/crop/padding settings for Task05 with SwinUNETR at this time? Thanks.
Hi @tangy5 , this is a follow-up ticket to #952 . My overall plan is to resolve the CI/CD blocker there by switching all example cases from Task05 to Task09, and then to look into the spacing for Task05.
My initial thought is to use the datastats from the Auto3DSeg data analyzer to find the right spatial transform. How do you determine the right number (0.2/0.5) for the spacing transform right now?
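For example, something along these lines could pull the spacing statistics out of the data analyzer. This is a sketch: the datalist/dataroot paths are placeholders, and the exact keys read from the stats dictionary are my assumption of the layout.

# Sketch: run the Auto3DSeg DataAnalyzer and inspect the spacing statistics
from monai.apps.auto3dseg import DataAnalyzer

analyzer = DataAnalyzer(
    datalist="task05_datalist.json",  # placeholder datalist
    dataroot="Task05_Prostate",       # placeholder data root
    output_path="./datastats.yaml",
)
stats = analyzer.get_all_case_stats()
# e.g. use the summarized spacing as a starting point for Spacingd's pixdim
print(stats["stats_summary"]["image_stats"]["spacing"])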
@mingxin-zheng , using Task09 for the tutorial makes a lot of sense. Thank you.
For Task05, the prior research contribution uses 0.5 for spacing. Resampling the roughly 3.6 mm slices to 0.5 mm makes even a 20-slice volume span well over 96 voxels in z, so SpatialPadd never has to pad beyond the input size.
The complete transform designed for Task05 last year is below:
import numpy as np
from monai.transforms import (
    AddChanneld,
    AsChannelFirstd,
    Compose,
    LoadImaged,
    NormalizeIntensityd,
    Orientationd,
    RandAdjustContrastd,
    RandAffined,
    RandFlipd,
    RandRotate90d,
    RandScaleIntensityd,
    RandShiftIntensityd,
    RandSpatialCropd,
    Spacingd,
    SpatialPadd,
    ToTensord,
)

train_transforms = Compose(
    [
        LoadImaged(keys=["image", "label"]),
        AsChannelFirstd(keys="image"),
        AddChanneld(keys=["label"]),
        # ConvertToMultiChannelBasedOnBratsClassesd(keys="label"),
        Orientationd(keys=["image", "label"], axcodes="RAS"),
        # resample to 0.5 mm isotropic spacing before padding/cropping
        Spacingd(keys=["image", "label"], pixdim=(0.5, 0.5, 0.5), mode=("bilinear", "nearest")),
        SpatialPadd(keys=["image", "label"], spatial_size=[96, 96, 96]),
        RandSpatialCropd(keys=["image", "label"], roi_size=[96, 96, 96], random_size=False),
        RandFlipd(keys=["image", "label"], prob=0.5, spatial_axis=0),
        NormalizeIntensityd(keys="image", nonzero=True, channel_wise=True),
        RandScaleIntensityd(keys="image", factors=0.1, prob=0.5),
        RandShiftIntensityd(keys="image", offsets=0.1, prob=0.5),
        RandAffined(
            keys=["image", "label"],
            mode=("bilinear", "nearest"),
            prob=0.1,
            shear_range=(0, 0, 0),
            translate_range=(0, 0, 0),
            rotate_range=(0, 0, np.pi),
            scale_range=(0.3, 0.3, 0),
            padding_mode="zeros",
        ),
        RandRotate90d(keys=["image", "label"], prob=0.10, max_k=3),
        RandShiftIntensityd(keys=["image"], offsets=0.10, prob=0.50),
        RandScaleIntensityd(keys=["image"], factors=0.25, prob=0.5),
        RandAdjustContrastd(keys=["image"], gamma=(0.7, 1.5), prob=0.5),
        ToTensord(keys=["image", "label"]),
    ]
)
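As a quick sanity check (not part of the original transform; the file paths below are placeholders), applying it to one Task05 case should yield 96x96x96 patches:

# Sketch: apply the transform to a single (placeholder) Task05 case
sample = {
    "image": "Task05_Prostate/imagesTr/prostate_00.nii.gz",  # placeholder path
    "label": "Task05_Prostate/labelsTr/prostate_00.nii.gz",  # placeholder path
}
out = train_transforms(sample)
print(out["image"].shape, out["label"].shape)  # expect (2, 96, 96, 96) and (1, 96, 96, 96)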
Closing this issue as the updated algorithm templates have resolved it.