DocShadow-SD7K
ValueError: Requested crop size (2063, 2452) is larger than the image size (3690, 2443) (on compressed SD7K)
Dear authors, thanks for sharing the SD7K dataset! I downloaded the compressed version of the data, but when I run python train.py
with the default config file, I get a ValueError:
ValueError: Requested crop size (2063, 2452) is larger than the image size (3690, 2443)
I did not change any other settings when running the train.py
script, and the same script runs without error on the Jung dataset.
Here is the full traceback:
Traceback (most recent call last):
File "train.py", line 132, in <module>
train()
File "train.py", line 65, in train
for i, data in enumerate(tqdm(trainloader, disable=not accelerator.is_local_main_process)):
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/tqdm/std.py", line 1181, in __iter__
for obj in iterable:
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/accelerate/data_loader.py", line 462, in __iter__
next_batch = next(dataloader_iter)
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
data = self._next_data()
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
return self._process_data(data)
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
data.reraise()
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/torch/_utils.py", line 705, in reraise
raise exception
ValueError: Caught ValueError in DataLoader worker process 2.
Original Traceback (most recent call last):
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index) # type: ignore[possibly-undefined]
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/sensei-fs/users/bxiao/DocShadow-SD7K/data/dataset_RGB.py", line 61, in __getitem__
transformed = self.transform(image=inp_img, target=tar_img)
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/albumentations/core/composition.py", line 210, in __call__
data = t(force_apply=force_apply, **data)
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/albumentations/core/transforms_interface.py", line 97, in __call__
return self.apply_with_params(params, **kwargs)
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/albumentations/core/transforms_interface.py", line 112, in apply_with_params
res[key] = target_function(arg, **dict(params, **target_dependencies))
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/albumentations/augmentations/crops/transforms.py", line 246, in apply
crop = F.random_crop(img, crop_height, crop_width, h_start, w_start)
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/albumentations/augmentations/crops/functional.py", line 25, in random_crop
raise ValueError(
ValueError: Requested crop size (2063, 2452) is larger than the image size (3690, 2443)
I was wondering if you have any suggestions on this issue?
What are the W and H values in your config yml file? The default width and height should be 512.
This crop size looks unusual.
Here are the settings in the config file:
# Optimization arguments.
OPTIM:
  BATCH_SIZE: 1
  NUM_EPOCHS: 300
  LR_INITIAL: 2e-4
  LR_MIN: 1e-6
  SEED: 3407
  WANDB: False

TRAINING:
  VAL_AFTER_EVERY: 1
  RESUME: False
  PS_W: 512
  PS_H: 512
  TRAIN_DIR: '/bxiao/DocShadow-SD7K/data/train/'  # path to training data
  VAL_DIR: '/bxiao/DocShadow-SD7K/data/test/'     # path to validation data
  SAVE_DIR: './checkpoints/'                      # path to save models and images
  ORI: False
I used the default values (512, 512) for width and height. They work for both the Jung and Kligler datasets, but raise that error on the SD7K data (9G version).
One observation: if I change the RandomResizedCrop in this line to RandomCrop, training runs without error (see the sketch below). But this change provides less diversity in data augmentation, so it may not be the right fix.
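Concretely, the change I tried looks roughly like this (a sketch of the transform in data/dataset_RGB.py; everything other than the crop is left as in the repo):

self.transform = A.Compose([
        A.Flip(p=0.3),
        A.RandomRotate90(p=0.3),
        A.ColorJitter(p=0.3),
        A.Affine(p=0.3),
        # RandomCrop takes a fixed 512 x 512 patch, which fits inside images
        # of this size, so the requested crop can never exceed the image.
        A.RandomCrop(height=img_options['h'], width=img_options['w'])],
    additional_targets={
        'target': 'image',
    }
)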
Changing this line is indeed a way to fix your problem. You can try setting A.RandomResizedCrop(height=512, width=512)
and see what happens.
We control the crop size with PS_W
and PS_H
, so there must be some modification that prevents these two parameters from reaching that line (see the sketch below). I am not sure whether you changed anything, but the two forms are equivalent, so don't worry about the diversity in data augmentation.
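For context, the patch size is supposed to flow from the config into the crop roughly like this (a minimal sketch; the config file name and the intermediate variable names are illustrative, not copied from train.py):

import yaml
import albumentations as A

# Read the patch size from the YAML config shown above
# ('config.yml' is a placeholder for the actual config path).
with open('config.yml') as f:
    cfg = yaml.safe_load(f)

img_options = {'w': cfg['TRAINING']['PS_W'],   # 512 by default
               'h': cfg['TRAINING']['PS_H']}   # 512 by default
print(img_options)                             # expected: {'w': 512, 'h': 512}

# dataset_RGB.py then builds the crop from these values, so the two lines
# below should be equivalent when the config is picked up correctly:
A.RandomResizedCrop(height=img_options['h'], width=img_options['w'])
A.RandomResizedCrop(height=512, width=512)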
Thanks @zinuoli for the reply. In case I had revised something by accident, I deleted and re-cloned the repo just now, but I still get the same error on the SD7K data (9G version). So it is unlikely that anything in the code was changed from the original repo.
Another observation: unifying the image size before the other augmentations by adding A.Resize(height=3699, width=2462)
also makes train.py
work on the SD7K data:
self.transform = A.Compose([
        A.Resize(height=3699, width=2462),
        A.Flip(p=0.3),
        A.RandomRotate90(p=0.3),
        A.ColorJitter(p=0.3),
        A.Affine(p=0.3),
        A.RandomResizedCrop(height=img_options['h'], width=img_options['w'])],
    additional_targets={
        'target': 'image',
    }
)
Yes, that is another way to solve it. Could you change A.RandomResizedCrop(height=img_options['h'], width=img_options['w'])
to A.RandomResizedCrop(height=512, width=512)
and see what happens?
Just tried A.RandomResizedCrop(height=512, width=512). It gives the same error.
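In case it helps to debug, here is a minimal sketch that exercises the same transform outside the DataLoader (the file names are placeholders for one SD7K training pair):

import cv2
import albumentations as A

# Placeholders for one input/target pair from the compressed SD7K training set.
inp_img = cv2.imread('data/train/input/0001.png')
tar_img = cv2.imread('data/train/target/0001.png')
print(inp_img.shape, tar_img.shape)   # worth checking whether the two shapes match

transform = A.Compose(
    [A.RandomResizedCrop(height=512, width=512)],
    additional_targets={'target': 'image'})

# This is the call that fails inside dataset_RGB.py's __getitem__.
out = transform(image=inp_img, target=tar_img)
print(out['image'].shape, out['target'].shape)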
Seems strange...let me try and get back to you.
Hi, I have tested on my server and I don't have any problem here. Which version of albumentations
are you using? According to requirements.txt
, it should be 1.3.0.
The version of albumentations
on my server is 1.3.0.
That's strange. I didn't encounter any of the problems you mentioned above, and I can confirm the codebase runs without bugs on my end. I did hit an issue when I used version 1.4.4, but after switching to 1.3.0 everything worked fine.
In that case you may need to revise the code to work around the issue you are seeing.
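If you want to rule out an environment mismatch, a quick sanity check from inside the same conda environment that launches train.py (a sketch, nothing specific to our code; your traceback shows the BGShadowNet env):

# Run inside the env used for training.
import albumentations
import torch

print('albumentations:', albumentations.__version__)   # expected: 1.3.0 per requirements.txt
print('torch         :', torch.__version__)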