`Resize(feature_extractor.size)`: `size` is a dictionary, not an int or sequence
System Info
Hello, this bug was already addressed once (https://discuss.huggingface.co/t/image-classification-tutorial-bug/37267), but it arose again today on Google when running this script.
```python
from torchvision.transforms import (
    CenterCrop,
    Compose,
    Normalize,
    RandomHorizontalFlip,
    RandomResizedCrop,
    Resize,
    ToTensor,
)

normalize = Normalize(mean=feature_extractor.image_mean, std=feature_extractor.image_std)
train_transforms = Compose(
    [
        RandomResizedCrop(feature_extractor.size),
        RandomHorizontalFlip(),
        ToTensor(),
        normalize,
    ]
)
val_transforms = Compose(
    [
        Resize(feature_extractor.size),  ## this is the error line
        CenterCrop(feature_extractor.size),
        ToTensor(),
        normalize,
    ]
)


def preprocess_train(example_batch):
    """Apply train_transforms across a batch."""
    example_batch["pixel_values"] = [
        train_transforms(image.convert("RGB")) for image in example_batch["image"]
    ]
    return example_batch


def preprocess_val(example_batch):
    """Apply val_transforms across a batch."""
    example_batch["pixel_values"] = [
        val_transforms(image.convert("RGB")) for image in example_batch["image"]
    ]
    return example_batch
```
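For context on the failure: the `TypeError` comes from torchvision's own argument validation, which rejects anything that is not an int or a sequence. A minimal stdlib-only sketch of that check (mirroring, not importing, torchvision's internals, so treat the exact implementation as an assumption):

```python
from collections.abc import Sequence


def check_size(size):
    """Mimic the validation torchvision's Resize applies to its `size` argument."""
    if not isinstance(size, (int, Sequence)):
        raise TypeError(f"Size should be int or sequence. Got {type(size)}")
    return size


# An int or a (height, width) tuple passes:
check_size(224)
check_size((224, 224))

# A feature extractor's `size` dict does not (a dict is not a Sequence):
try:
    check_size({"shortest_edge": 224})
except TypeError as e:
    print(e)  # Size should be int or sequence. Got <class 'dict'>
```

This is why passing `feature_extractor.size` straight into `Resize` fails as soon as `size` is a dictionary.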
I tried installing the exact version of transformers suggested here, but it did not help: https://stackoverflow.com/questions/76142308/fixerror-typeerror-size-should-be-int-or-sequence-got-class-dict
Here is my transformers environment info:
- `transformers` version: 4.23.1
- Platform: Linux-6.1.58+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.22.2
- PyTorch version (GPU?): 2.2.1+cu121 (False)
- Tensorflow version (GPU?): 2.15.0 (False)
- Flax version (CPU?/GPU?/TPU?): 0.8.2 (cpu)
- Jax version: 0.4.26
- JaxLib version: 0.4.26
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Who can help?
@amyeroberts
Information
- [ ] The official example scripts
- [X] My own modified scripts
Tasks
- [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [X] My own task or dataset (give details below)
Reproduction
- Get my dataset from https://huggingface.co/datasets/samokosik/clothes
- Try to preprocess it via the script I submitted.
- You will get this error.
Expected behavior
`size` should not be a dictionary; the transforms expect an int or sequence.
Hi @dehlong, thanks for opening an issue!
For future issues, please make sure to share:
- The running environment: run `transformers-cli env` in the terminal and copy-paste the output
- Information about the error encountered, including the full traceback
The feature extractors for vision models have been deprecated for a while now (over a year), with image processors taking their place. The image processors store `size` as a dictionary. This disambiguates resizing behaviour: previously, `size` could define either the shortest edge or the height and width.
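To make that concrete, here is a small sketch of the two `size` dictionary shapes and how each translates into a torchvision-style argument. The exact keys depend on the checkpoint; the values below are illustrative assumptions, not taken from the issue:

```python
# Illustrative `size` dicts as stored by image processors:
fixed_style = {"height": 224, "width": 224}  # explicit height/width
edge_style = {"shortest_edge": 224}          # shortest-edge resizing


def to_torchvision_size(size):
    """Translate an image-processor `size` dict into a torchvision argument."""
    if "height" in size:
        return (size["height"], size["width"])  # explicit (height, width) pair
    elif "shortest_edge" in size:
        return size["shortest_edge"]            # single int: resize shortest edge
    raise ValueError(f"Unrecognized size dict: {size}")


print(to_torchvision_size(fixed_style))  # (224, 224)
print(to_torchvision_size(edge_style))   # 224
```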
You can see up-to-date examples of how to use them in our examples scripts e.g. here for image classification.
The updated script would look like this:
```python
from torchvision.transforms import (
    CenterCrop,
    Compose,
    Normalize,
    RandomHorizontalFlip,
    RandomResizedCrop,
    Resize,
    ToTensor,
)

size = image_processor.size
if "height" in size:
    crop_size = (size["height"], size["width"])
    resize_size = (size["height"], size["width"])
elif "shortest_edge" in size:
    crop_size = resize_size = size["shortest_edge"]

normalize = Normalize(mean=image_processor.image_mean, std=image_processor.image_std)
train_transforms = Compose(
    [
        RandomResizedCrop(crop_size),
        RandomHorizontalFlip(),
        ToTensor(),
        normalize,
    ]
)
val_transforms = Compose(
    [
        Resize(resize_size),
        CenterCrop(crop_size),
        ToTensor(),
        normalize,
    ]
)


def preprocess_train(example_batch):
    """Apply train_transforms across a batch."""
    example_batch["pixel_values"] = [
        train_transforms(image.convert("RGB")) for image in example_batch["image"]
    ]
    return example_batch


def preprocess_val(example_batch):
    """Apply val_transforms across a batch."""
    example_batch["pixel_values"] = [
        val_transforms(image.convert("RGB")) for image in example_batch["image"]
    ]
    return example_batch
```
Out of interest - where did you get this example from? It would be great to know in case there are places in our resources or documentation we need to make sure are updated.
Hello,
thank you for the reply. However, may I ask whether there are any caveats with the image processor?
```python
size = image_processor.size
```
^^ This line gives an error that `image_processor` is not defined. I tried importing it directly from transformers, but with no success. Or is there something I am unaware of, and am I supposed to build my own image processor like the one displayed in the code you sent?
Also, regarding the example: I got it from Rajistics (https://www.youtube.com/watch?v=ahgB8c_TgA8).
Yes, you need to define the image processor in the same way you defined the feature extractor:

```python
from transformers import AutoImageProcessor

image_processor = AutoImageProcessor.from_pretrained(checkpoint)
```
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.