Add Unlabeled Image Dataset for Unsupervised Training
🚀 The feature
I'm proposing to add a dataset class for unsupervised learning (e.g., generative models), where the dataset consists of a flat folder of unlabeled images.
Introduce a new class, e.g. UnlabeledImageDataset, that:
- Accepts a flat folder of image files
- Returns only images (no labels)
- Follows ImageFolder conventions where applicable
- Resides in torchvision/datasets/folder.py and reuses existing utilities
Introducing a new class avoids increasing complexity in ImageFolder.
Motivation, pitch
torchvision.datasets.ImageFolder and DatasetFolder are designed for supervised tasks, requiring a specific directory structure and class-label mappings. In unsupervised scenarios, I end up writing custom datasets for this case. A built-in dataset would improve usability and consistency across the PyTorch ecosystem.
This feature request is similar in spirit to Issue #660, where a user suggested supporting unlabeled or unsupervised datasets. The use case remains common, and a lightweight, built-in solution would reduce boilerplate and improve consistency.
Alternatives
An alternative would be to add an "unsupervised" mode to ImageFolder, as suggested in Issue #660. However, that would increase the complexity of that class, as pointed out in a comment on that issue.
Additional context
It feels like this functionality belongs in a common library, especially since ImageFolder is already present in torchvision.
Would you be open to adding this? I'd be happy to contribute a PR if there's interest.
Thanks!
Thanks for the feature request, @mduszyk. Can you share a bit more about the API you have in mind? Naively, this sounds like a shallow wrapper around pathlib.Path.glob()?
I was thinking about something like this:
from pathlib import Path

from torchvision.io import read_image, ImageReadMode


class UnlabeledImageFolder:
    def __init__(self, root_dir, patterns=('**/*.jpg', '**/*.png'), transform=None):
        self.root = Path(root_dir)
        # Collect every file under root that matches any of the glob patterns.
        self.images = []
        for pattern in patterns:
            self.images.extend(self.root.glob(pattern))
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, i):
        # read_image is documented to take a string path, so convert the Path.
        img = read_image(str(self.images[i]), ImageReadMode.RGB)
        if self.transform is not None:
            img = self.transform(img)
        return img
It uses glob, which allows for multiple patterns, loads each image as RGB, and applies an optional transform.
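As a rough sketch of the file-collection step only (`collect_images` is a hypothetical helper name, not a torchvision API): overlapping patterns can match the same file more than once, and `Path.glob` does not promise any particular order, so deduplicating and sorting the matches would keep dataset indices stable across runs.

```python
from pathlib import Path


def collect_images(root, patterns=("**/*.jpg", "**/*.png")):
    """Return a sorted, duplicate-free list of files matching any pattern.

    Hypothetical helper for illustration; not part of torchvision.
    """
    root = Path(root)
    matches = set()
    for pattern in patterns:
        # A set drops repeats that arise when two patterns match the same file;
        # is_file() skips directories whose names happen to match a pattern.
        matches.update(p for p in root.glob(pattern) if p.is_file())
    # Sorting gives a deterministic index -> file mapping.
    return sorted(matches)
```

The returned list could then replace `self.images` in the class above.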
One more idea is to unify the ImageFolder API: depending on its init parameters, it could internally instantiate either LabeledImageFolder or UnlabeledImageFolder. This way the two implementations would stay separate, and the user would see a single API.
Looking forward to hearing your thoughts on this.
Thanks for the details. I think this is reasonable but I hope we can support that with the existing ImageFolder or DatasetFolder, perhaps with some minor modifications. Can you check if allow_empty=True supports what you need already?
DatasetFolder, and ImageFolder which extends it, assume a certain directory structure in which the subdirectories of the dataset root directory are treated as classes. The dataset then returns (input, target) pairs. I was hoping to also be able to work with datasets that have no targets, i.e. where only the image is returned by __getitem__.
Here is a general view of this in the code:
class DatasetFolder(VisionDataset):
    ...
    def find_classes(self, directory: Union[str, Path]) -> tuple[list[str], dict[str, int]]:
        """Find the class folders in a dataset structured as follows::

            directory/
            ├── class_x
            │   ├── xxx.ext
            │   ├── xxy.ext
            │   └── ...
            │       └── xxz.ext
            └── class_y
                ├── 123.ext
                ├── nsdf3.ext
                └── ...
                └── asd932_.ext
        ...
        """

class ImageFolder(DatasetFolder):
    ...
allow_empty=True makes it treat empty folders as classes with zero samples instead of raising an exception. So it does not help if we want to let users load images from a flat folder.
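To make the limitation concrete, here is a simplified stand-in for the class-discovery step, written with the standard library only. It is an illustration of the behavior described above, not the torchvision source, and `find_classes_sketch` and `is_flat_folder_rejected` are hypothetical names. Only subdirectories count as classes, so a folder that contains nothing but image files yields no classes and is rejected.

```python
import os


def find_classes_sketch(directory):
    """Simplified stand-in for DatasetFolder.find_classes (illustration only)."""
    # Only subdirectories count as classes; files directly inside
    # `directory` are ignored at this stage.
    classes = sorted(entry.name for entry in os.scandir(directory) if entry.is_dir())
    if not classes:
        raise FileNotFoundError(f"Couldn't find any class folder in {directory}.")
    class_to_idx = {name: i for i, name in enumerate(classes)}
    return classes, class_to_idx


def is_flat_folder_rejected(directory):
    """Return True if class discovery fails because there are no subdirectories."""
    try:
        find_classes_sketch(directory)
    except FileNotFoundError:
        return True
    return False
```

Since allow_empty only changes what happens after the class folders have been found, a flat folder never gets past this step.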
Initially I was thinking about extending VisionDataset, to stay compatible with the other torchvision datasets while keeping the implementation separate. However, it would also make sense to modify DatasetFolder and keep the API consistent.
Let me know your thoughts on this. If you are interested, I could propose modifications to DatasetFolder.
Hi @mduszyk, sorry for the late reply.
OK, I'm happy to consider a PR for this. If you submit it, please make sure to write a test similar to https://github.com/pytorch/vision/blob/98f8b3757c0648724064ca95434b18281c43c5f6/test/test_datasets.py#L1737C1-L1737C2
To keep the API as close as possible to ImageFolder, let's have the loader parameter as well. As for the default patterns, I think the default should be None, which would expand to the combination of all the extensions in
https://github.com/pytorch/vision/blob/98f8b3757c0648724064ca95434b18281c43c5f6/torchvision/datasets/folder.py#L257
So something like
if patterns is None:
    patterns = [f"**/*{ext}" for ext in IMG_EXTENSIONS]
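Putting the suggestions together, a minimal sketch could look like the following. Several things here are assumptions rather than torchvision API: the class itself is hypothetical, `IMG_EXTENSIONS` is a local copy assumed to mirror the tuple in torchvision/datasets/folder.py, and `read_bytes_loader` is a placeholder default so the example runs without an image library. A real implementation would presumably extend VisionDataset and default to the same loader as ImageFolder.

```python
from pathlib import Path

# Local copy for illustration; assumed to mirror
# torchvision.datasets.folder.IMG_EXTENSIONS.
IMG_EXTENSIONS = (".jpg", ".jpeg", ".png", ".ppm", ".bmp", ".pgm", ".tif", ".tiff", ".webp")


def read_bytes_loader(path):
    """Placeholder loader (assumption) so the sketch needs no image library."""
    return Path(path).read_bytes()


class UnlabeledImageFolder:
    """Hypothetical sketch of the proposed dataset; not an existing torchvision class."""

    def __init__(self, root, patterns=None, loader=read_bytes_loader, transform=None):
        if patterns is None:
            # Default: one recursive pattern per known image extension.
            patterns = [f"**/*{ext}" for ext in IMG_EXTENSIONS]
        self.root = Path(root)
        self.loader = loader
        self.transform = transform
        found = set()
        for pattern in patterns:
            found.update(p for p in self.root.glob(pattern) if p.is_file())
        # Sorted so that indices are stable across runs.
        self.samples = sorted(found)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        # Returns only the sample; there is no target.
        sample = self.loader(self.samples[index])
        if self.transform is not None:
            sample = self.transform(sample)
        return sample
```

One design point that may be worth settling in the PR: on case-sensitive filesystems these glob patterns would not match a file such as `photo.JPG`, whereas ImageFolder's extension check compares lowercased filenames.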