MLDatasets.jl

Add `transform` keyword to dataset constructors to store a transformation

Open CarloLucibello opened this issue 2 years ago • 3 comments

Both HuggingFace and torchvision's datasets can store a transformation that is applied on the fly when indexing a dataset. I think this is generally very convenient and we should add it to every dataset.

CarloLucibello avatar May 28 '22 11:05 CarloLucibello
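
For readers unfamiliar with the pattern, here is a minimal sketch of what a stored transformation could look like. The `TransformedDataset` type and its `transform` keyword are hypothetical names invented for this illustration. They are not part of MLDatasets.jl, and the plain vector stands in for a real dataset.

```julia
# Hypothetical wrapper (not an MLDatasets.jl type): keeps a transform next to the data.
struct TransformedDataset{D,F}
    data::D
    transform::F
end

# Keyword constructor mirroring the proposed `transform` keyword; defaults to `identity`.
TransformedDataset(data; transform = identity) = TransformedDataset(data, transform)

Base.length(ds::TransformedDataset) = length(ds.data)

# The transform is applied on the fly, only when an observation is indexed.
Base.getindex(ds::TransformedDataset, i::Integer) = ds.transform(ds.data[i])

raw = [1.0, 2.0, 3.0]                               # stand-in for stored observations
ds = TransformedDataset(raw; transform = x -> 2x)

ds[2]  # 4.0
```

Under this design the raw data is never modified, so the same underlying dataset can be wrapped with different transforms.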

I feel it may be cleaner to tell users to use `mapobs` and give some examples. AFAICT there would be no functional difference, right?

lorenzoh avatar May 28 '22 12:05 lorenzoh
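
A rough sketch of the `mapobs` route, assuming MLUtils.jl is installed. The matrix `X` is a synthetic stand-in for a real dataset and the scaling function is only an example transform.

```julia
using MLUtils: mapobs, getobs, numobs

# Stand-in for a dataset: 4 observations with 3 features each
# (MLUtils treats the last dimension as the observation dimension).
X = reshape(collect(1.0f0:12.0f0), 3, 4)

# Wrap the data lazily; the function runs only when an observation is requested.
scaled = mapobs(x -> x ./ 12, X)

numobs(scaled)     # 4
getobs(scaled, 1)  # first column of X, divided by 12
```

Since `mapobs` wraps any data container that supports `getobs` and `numobs`, the same call should apply to a dataset object without that object needing a `transform` field of its own.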

Yes, maybe there is no advantage over `mapobs`. I'll keep the issue open until we add some examples to the docs.

CarloLucibello avatar May 29 '22 16:05 CarloLucibello

The recently introduced TorchData could be a source of inspiration: https://pytorch.org/data/beta/index.html

CarloLucibello avatar Jun 15 '22 15:06 CarloLucibello