DocShadow-SD7K
ValueError: Requested crop size (2063, 2452) is larger than the image size (3690, 2443) (on compressed SD7K)
Dear authors, thanks for sharing the SD7K dataset! I downloaded the compressed version of the data, but when I run python train.py
with the default config file, I get a ValueError:
ValueError: Requested crop size (2063, 2452) is larger than the image size (3690, 2443)
I did not change any other settings when running the train.py
script, and the same script runs without error on the Jung dataset.
Here is the full traceback:
Traceback (most recent call last):
File "train.py", line 132, in <module>
train()
File "train.py", line 65, in train
for i, data in enumerate(tqdm(trainloader, disable=not accelerator.is_local_main_process)):
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/tqdm/std.py", line 1181, in __iter__
for obj in iterable:
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/accelerate/data_loader.py", line 462, in __iter__
next_batch = next(dataloader_iter)
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
data = self._next_data()
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
return self._process_data(data)
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
data.reraise()
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/torch/_utils.py", line 705, in reraise
raise exception
ValueError: Caught ValueError in DataLoader worker process 2.
Original Traceback (most recent call last):
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index) # type: ignore[possibly-undefined]
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/sensei-fs/users/bxiao/DocShadow-SD7K/data/dataset_RGB.py", line 61, in __getitem__
transformed = self.transform(image=inp_img, target=tar_img)
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/albumentations/core/composition.py", line 210, in __call__
data = t(force_apply=force_apply, **data)
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/albumentations/core/transforms_interface.py", line 97, in __call__
return self.apply_with_params(params, **kwargs)
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/albumentations/core/transforms_interface.py", line 112, in apply_with_params
res[key] = target_function(arg, **dict(params, **target_dependencies))
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/albumentations/augmentations/crops/transforms.py", line 246, in apply
crop = F.random_crop(img, crop_height, crop_width, h_start, w_start)
File "/opt/conda/envs/BGShadowNet/lib/python3.8/site-packages/albumentations/augmentations/crops/functional.py", line 25, in random_crop
raise ValueError(
ValueError: Requested crop size (2063, 2452) is larger than the image size (3690, 2443)
I was wondering if you have any suggestions on this issue?
What are the W and H values in your config yml file? The default width and height should be 512.
This crop size looks unusual.
Here are the settings in the config file:
# Optimization arguments.
OPTIM:
  BATCH_SIZE: 1
  NUM_EPOCHS: 300
  LR_INITIAL: 2e-4
  LR_MIN: 1e-6
  SEED: 3407
  WANDB: False

TRAINING:
  VAL_AFTER_EVERY: 1
  RESUME: False
  PS_W: 512
  PS_H: 512
  TRAIN_DIR: '/bxiao/DocShadow-SD7K/data/train/'  # path to training data
  VAL_DIR: '/bxiao/DocShadow-SD7K/data/test/'     # path to validation data
  SAVE_DIR: './checkpoints/'                      # path to save models and images
  ORI: False
I used the default values (512, 512) for width and height. They work for both the Jung and Kligler datasets, but raise that error on the SD7K data (9G version).
One observation: if I change the RandomResizedCrop in this line to RandomCrop, training runs without error (see the sketch below). But this change provides less diversity in data augmentation, so it may not be the right fix.
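Concretely, the change I tried looks roughly like this (a sketch of the transform in data/dataset_RGB.py; everything other than the crop is left as in the repo):

self.transform = A.Compose([
        A.Flip(p=0.3),
        A.RandomRotate90(p=0.3),
        A.ColorJitter(p=0.3),
        A.Affine(p=0.3),
        # RandomCrop takes a fixed 512 x 512 patch, which fits inside images
        # of this size, so the requested crop can never exceed the image.
        A.RandomCrop(height=img_options['h'], width=img_options['w'])],
    additional_targets={
        'target': 'image',
    }
)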
Changing this line is indeed a way to fix your problem. You can try setting A.RandomResizedCrop(height=512, width=512)
and see what happens.
We control the crop size with PS_W
and PS_H
, so there must be some modification that prevents these two parameters from reaching that line (see the sketch below). I am not sure whether you changed anything, but the two forms are equivalent, so don't worry about the diversity in data augmentation.
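For context, the patch size is supposed to flow from the config into the crop roughly like this (a minimal sketch; the config file name and the intermediate variable names are illustrative, not copied from train.py):

import yaml
import albumentations as A

# Read the patch size from the YAML config shown above
# ('config.yml' is a placeholder for the actual config path).
with open('config.yml') as f:
    cfg = yaml.safe_load(f)

img_options = {'w': cfg['TRAINING']['PS_W'],   # 512 by default
               'h': cfg['TRAINING']['PS_H']}   # 512 by default
print(img_options)                             # expected: {'w': 512, 'h': 512}

# dataset_RGB.py then builds the crop from these values, so the two lines
# below should be equivalent when the config is picked up correctly:
A.RandomResizedCrop(height=img_options['h'], width=img_options['w'])
A.RandomResizedCrop(height=512, width=512)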
Thanks @zinuoli for the reply. In case I had revised something by accident, I deleted and re-cloned the repo just now, but I still get the same error on the SD7K data (9G version). So it is unlikely that anything in the code was changed from the original repo.
Another observation: unifying the image size before the other augmentations by adding A.Resize(height=3699, width=2462)
also makes train.py
work on the SD7K data:
self.transform = A.Compose([
        A.Resize(height=3699, width=2462),
        A.Flip(p=0.3),
        A.RandomRotate90(p=0.3),
        A.ColorJitter(p=0.3),
        A.Affine(p=0.3),
        A.RandomResizedCrop(height=img_options['h'], width=img_options['w'])],
    additional_targets={
        'target': 'image',
    }
)
Yes, that is another way to solve it. Could you change A.RandomResizedCrop(height=img_options['h'], width=img_options['w'])
to A.RandomResizedCrop(height=512, width=512)
and see what happens?
Just tried A.RandomResizedCrop(height=512, width=512). It gives the same error.
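In case it helps to debug, here is a minimal sketch that exercises the same transform outside the DataLoader (the file names are placeholders for one SD7K training pair):

import cv2
import albumentations as A

# Placeholders for one input/target pair from the compressed SD7K training set.
inp_img = cv2.imread('data/train/input/0001.png')
tar_img = cv2.imread('data/train/target/0001.png')
print(inp_img.shape, tar_img.shape)   # worth checking whether the two shapes match

transform = A.Compose(
    [A.RandomResizedCrop(height=512, width=512)],
    additional_targets={'target': 'image'})

# This is the call that fails inside dataset_RGB.py's __getitem__.
out = transform(image=inp_img, target=tar_img)
print(out['image'].shape, out['target'].shape)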
Seems strange...let me try and get back to you.
Hi, I have tested on my server and I don't have any problem here. Which version of albumentations
are you using? According to requirements.txt
, it should be 1.3.0.
The version of albumentations
on my server is 1.3.0.
That's strange. I didn't encounter any of the problems you mentioned above, and I can confirm the codebase runs without bugs on my end. I did hit an issue when I used version 1.4.4, but after switching to 1.3.0 everything worked fine.
In that case you may need to revise the code to work around the issue you are seeing.
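If you want to rule out an environment mismatch, a quick sanity check from inside the same conda environment that launches train.py (a sketch, nothing specific to our code; your traceback shows the BGShadowNet env):

# Run inside the env used for training.
import albumentations
import torch

print('albumentations:', albumentations.__version__)   # expected: 1.3.0 per requirements.txt
print('torch         :', torch.__version__)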