
Implement `dataset_map`?

Open · dfalbel opened this issue 4 years ago · 1 comment

It would take a dataset and a function to be applied to each item after `.getitem()` returns it. The idea is similar to `dataset_subset()`, although as far as I can tell there is no `dataset_map()` equivalent in PyTorch's API.

dfalbel · Aug 25 '21 20:08

Could it look like this?

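dataset_map <- dataset(
  initialize = function(dataset, fun) {
    self$dataset <- dataset
    self$fun <- fun
  },

  .getitem = function(idx) {
    # Fetch the item first, then apply `fun` to its first element only.
    # This avoids transforming the whole tensor on every access and
    # assumes items are lists such as those from tensor_dataset().
    item <- self$dataset[idx]
    item[[1]] <- self$fun(item[[1]])
    item
  },

  .length = function() {
    length(self$dataset)
  }
)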

and used like this:

# Test it
x <- torch_randn(20, 5)
y <- torch_randn(20)

data <- tensor_dataset(x, y)

a <- dataset_map(data, function(x) x + 100)

mohamed-180 · Aug 27 '21 16:08
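For comparison, the issue text describes applying the function to whatever `.getitem()` returns, not to one tensor of a `tensor_dataset()`. A minimal sketch of that reading follows. The name `mapped_dataset` is hypothetical and not part of the torch API, and the sketch assumes the wrapped dataset supports `[` and `length()` in the usual way.

```r
library(torch)

# Hypothetical wrapper (not part of torch): applies `fun` to each item
# after the wrapped dataset's .getitem() has produced it.
mapped_dataset <- dataset(
  name = "mapped_dataset",

  initialize = function(dataset, fun) {
    self$dataset <- dataset
    self$fun <- fun
  },

  .getitem = function(idx) {
    # `fun` receives the whole item, e.g. list(x_i, y_i) for tensor_dataset()
    self$fun(self$dataset[idx])
  },

  .length = function() {
    length(self$dataset)
  }
)

x <- torch_randn(20, 5)
y <- torch_randn(20)
data <- tensor_dataset(x, y)

# Shift the features by 100 and leave the target unchanged
mapped <- mapped_dataset(data, function(item) {
  list(item[[1]] + 100, item[[2]])
})

item <- mapped[1]
```

Passing the whole item to `fun` keeps the wrapper independent of the dataset's internals, so it should also work with datasets that return named lists, at the cost of a slightly more verbose `fun`.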