Issue with `ImageClassificationData.from_dataset`

Open funnym0nk3y opened this issue 2 years ago • 1 comment

🐛 Bug

There seems to be an issue with the `ImageClassificationData.from_datasets` method: it does not produce a datamodule in the expected format, where the labels can be accessed via `datamodule.labels`.

To Reproduce

The error occurred with the following code, adapted from the example:

...

datamodule = ImageClassificationData.from_datasets(
    train_dataset=train_dataset,
    val_dataset=valid_dataset,
    batch_size=32,
)


# 2. Build the task
model = ImageClassifier(backbone="efficientnet_b0", labels=datamodule.labels)

...

The datasets are created via:

...

train_val_dataset = datasets.ImageFolder(train_val_folder)

....

train_dataset, valid_dataset = random_split(
    dataset=train_val_dataset,
    lengths=[no_train_images, no_valid_images],
    generator=torch.Generator().manual_seed(42),
)
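
One possible contributing factor, not confirmed anywhere in this thread, is that `random_split` returns `torch.utils.data.Subset` wrappers, and a `Subset` does not forward attributes such as `classes` from the dataset it wraps. The sketch below illustrates only that PyTorch behavior. `TinyFolderLikeDataset` is a hypothetical stand-in for `datasets.ImageFolder` so that no image files are needed, and the class names and split sizes are made up for illustration.

```python
import torch
from torch.utils.data import Dataset, random_split


class TinyFolderLikeDataset(Dataset):
    """Hypothetical stand-in for ImageFolder: exposes `classes` and `targets`."""

    def __init__(self):
        self.classes = ["cat", "dog"]  # made-up class names
        self.targets = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, index):
        # A zero tensor stands in for a decoded image.
        return torch.zeros(3, 8, 8), self.targets[index]


full_dataset = TinyFolderLikeDataset()
train_subset, valid_subset = random_split(
    dataset=full_dataset,
    lengths=[8, 2],
    generator=torch.Generator().manual_seed(42),
)

# random_split returns Subset wrappers around the original dataset.
print(type(train_subset).__name__)       # Subset
# The wrapper does not expose the class names of the wrapped dataset...
print(hasattr(train_subset, "classes"))  # False
# ...but they are still reachable through the `dataset` attribute.
labels = train_subset.dataset.classes
print(labels)                            # ['cat', 'dog']
```

If this turns out to be the cause, the class names recovered this way could be passed to the task explicitly instead of relying on `datamodule.labels`. Whether that resolves the reported error has not been verified here.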

Expected behavior

I'd expect the `from_datasets` method to create a valid datamodule that can be used for training.

Environment

  • OS (e.g., Linux): Colab instance
  • Python version: 3.8 I guess
  • PyTorch/Lightning/Flash Version (e.g., 1.10/1.5/0.7): 1.13.0+cu116/1.8.6/0.8.1.post0

funnym0nk3y • Jan 07 '23 22:01

@funnym0nk3y could you pls share a full example so it is clear what packages and imports you used?

Borda • Aug 09 '23 08:08