MLDatasets.jl

Add FastAI datasets

Open lorenzoh opened this issue 3 years ago • 5 comments

This adds datadeps for all datasets from the FastAI dataset collection, as proposed in DLDatasets.jl#1.

The basic functionality:

using MLDatasets.FastAIDatasets
using MLDatasets.FastAIDatasets:
    datasetpath,  # download dataset and get directory
    DATASETS,  # list of all datasets
    loaddataclassification,  # load an image classification dataset into a data container with observations `(image, class)`
    loaddatasegmentation  # load an image segmentation dataset into a data container with observations `(image, mask)`


lorenzoh avatar Feb 25 '21 18:02 lorenzoh

What's the reason for having loaddatasegmentation / loaddataclassification and not just loaddata?

CarloLucibello avatar Feb 25 '21 21:02 CarloLucibello

They're supposed to work on any folder containing a dataset in the right format, not just the included datasets. Also, some datasets can be used for multiple different tasks, so there is no 1-to-1 mapping between datasets and tasks.
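To illustrate what a loader that works on any folder could look like, here is a rough sketch using only Julia's standard library. It is not the PR's code: the function name `classification_files` is hypothetical, and the layout `root/<class name>/<image file>` is an assumption about what "the right format" means for classification. It returns file paths rather than decoded images to stay self-contained.

```julia
# Hypothetical sketch, not the implementation in this PR.
# Assumed folder layout: root/<class name>/<image file>

"""List `(filepath, class)` pairs for every file under `root/<class>/`."""
function classification_files(root::AbstractString)
    samples = Tuple{String,String}[]
    for class in sort(readdir(root))
        classdir = joinpath(root, class)
        isdir(classdir) || continue  # skip stray files at the top level
        for file in sort(readdir(classdir))
            push!(samples, (joinpath(classdir, file), class))
        end
    end
    return samples
end

# Demo on a throwaway folder so the sketch runs without downloading anything.
root = mktempdir()
for (class, file) in [("cat", "1.jpg"), ("cat", "2.jpg"), ("dog", "1.jpg")]
    mkpath(joinpath(root, class))
    touch(joinpath(root, class, file))
end

samples = classification_files(root)
```

Because the function only takes a path, it does not care whether the folder came from a registered datadep or from the user's own disk. A segmentation loader could take the same path and pair images with masks instead.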

lorenzoh avatar Feb 26 '21 15:02 lorenzoh

Is this PR still desired/feasible?

MariusDrulea avatar Nov 26 '22 23:11 MariusDrulea

I will leave the decision on whether this is desirable to the more active maintainers of this repository, but I will note that, implementation-wise, a lot has changed since this PR was first opened. The largest change is that LearnBase.jl and MLDataPattern.jl have been superseded by MLUtils.jl. On the FastAI.jl side, I am always happy to take things out and move them into a more canonical package, as would be the case with these datasets.
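For context on what an MLUtils.jl-compatible data container involves, here is a minimal sketch. The type `FileClassificationData` is hypothetical and not part of this PR or of MLDatasets.jl. The sketch relies on MLUtils.jl's documented fallbacks, where `numobs` defaults to `length` and `getobs` defaults to `getindex`, so it only defines Base methods and runs without any packages installed.

```julia
# Hypothetical container type, for illustration only.
struct FileClassificationData
    files::Vector{String}
    labels::Vector{String}
end

# MLUtils.jl's `numobs` and `getobs` fall back to `length` and `getindex`,
# so defining these Base methods is enough for a custom container.
Base.length(data::FileClassificationData) = length(data.files)

# A single observation is a `(file, label)` tuple.
Base.getindex(data::FileClassificationData, i::Integer) =
    (data.files[i], data.labels[i])

# Indexing with several indices returns a vector of observations.
Base.getindex(data::FileClassificationData, is::AbstractVector{<:Integer}) =
    [data[i] for i in is]

data = FileClassificationData(["cat/1.jpg", "dog/1.jpg"], ["cat", "dog"])
first_obs = data[1]
```

With MLUtils.jl loaded, `numobs(data)` and `getobs(data, 1)` should then work on this type without further definitions.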

lorenzoh avatar Nov 27 '22 10:11 lorenzoh

I would be in favor of moving these datasets here, provided we manage to make the interface consistent with the other datasets here. I don't know how hard that would be.

CarloLucibello avatar Nov 27 '22 13:11 CarloLucibello