modAL Data augmentation with `skorch`

Open arthur-thuy opened this issue 2 years ago • 0 comments

I am using modAL with skorch to integrate with PyTorch. Most tutorials, such as modAL's "PyTorch models in modAL workflows" tutorial, use the MNIST dataset, which typically gets the same data transformation in the train, pool, validation, and test sets.

From the tutorial:

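from torch.utils.data import DataLoader
from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor

mnist_data = MNIST('.', download=True, transform=ToTensor())
# A single batch of 60000 loads the whole training set at once
dataloader = DataLoader(mnist_data, shuffle=True, batch_size=60000)
X, y = next(iter(dataloader))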

I would like to use data augmentation for more complex computer vision problems. The difficulty is that the augmentation should be applied only to the train set, not to the pool set. The modAL tutorial creates a DataLoader with one large batch, applies the data transformations, splits the result into labelled (train) and unlabelled (pool) sets, and feeds these to the modAL functions.

As a result, observations that move from the pool set to the labelled set should receive additional transformations at that point. However, I have difficulty applying these transformations because the PyTorch transforms expect PIL images, and by then the data has already been converted to tensors.

What would be the best way to handle this?

Thank you

Jun 22 '23 15:06 arthur-thuy
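
One possible direction, sketched under assumptions rather than tested against modAL: keep the labelled and pool sets as plain tensors, and apply the augmentation per batch inside the training DataLoader only. skorch forwards parameters prefixed with `iterator_train__` to the training DataLoader, so passing `iterator_train__collate_fn=augmenting_collate` to `NeuralNetClassifier` should leave prediction on the pool set (which uses the validation iterator) unaugmented. The names `random_hflip` and `augmenting_collate` below are hypothetical helpers, and the horizontal flip only stands in for whatever tensor-based augmentation the problem needs.

```python
import torch
from torch.utils.data import default_collate


def random_hflip(images: torch.Tensor, p: float = 0.5) -> torch.Tensor:
    """Flip each image of an (N, C, H, W) batch left-right with probability p."""
    # One independent coin toss per image; broadcast over C, H, W.
    flip_mask = (torch.rand(images.shape[0]) < p).view(-1, 1, 1, 1)
    return torch.where(flip_mask, torch.flip(images, dims=[-1]), images)


def augmenting_collate(batch):
    """Stack (x, y) samples as usual, then augment the image tensor only."""
    X, y = default_collate(batch)
    return random_hflip(X), y


# Tiny stand-in for a labelled set: eight 1x2x2 "images" with label 0.
samples = [(torch.arange(4.0).reshape(1, 2, 2), torch.tensor(0)) for _ in range(8)]
X_batch, y_batch = augmenting_collate(samples)
```

Because the function works on tensors, it sidesteps the PIL requirement; observations queried from the pool are augmented automatically once they are part of the data passed to training, with no extra step at the moment they are moved.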
