
Variable number of transformed samples

Open EelcoHoogendoorn opened this issue 4 years ago • 5 comments

I'm working on some methods where the number of positive samples considered per original datapoint is a pretty important variable. Currently, the lightly codebase seems to assume pairs everywhere.

Now, it's not too hard to hack in a different number that works for my current purposes, but is there any appetite for a proper generalization to a configurable number of transformed samples per datapoint? Is this something that has been considered before, or would it run into trouble with some parts of the design?

Ideally this would come in the form of a backwards-incompatible API change where the dataloader returns something of shape [batch, n, whatever], which is cleaner for large n than some huge tuple you have to concat afterwards. But I get that people don't like API changes, so I'd be happy to settle for longer tuples for now.
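A minimal sketch of what such a collate could look like (collate_n_views is a hypothetical name, not part of lightly):

```python
import torch

def collate_n_views(batch, transform, n=2):
    """Collate a batch of (image, label) pairs into n stacked views."""
    # Apply the transform n times per image and stack along a new views
    # dimension, instead of returning an n-tuple of separate batches.
    views = torch.stack(
        [torch.stack([transform(img) for _ in range(n)]) for img, _ in batch]
    )
    labels = torch.tensor([label for _, label in batch])
    return views, labels
```

Passed as `collate_fn` to a DataLoader, this yields views of shape [batch_size, n, C, H, W].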

EelcoHoogendoorn avatar Apr 09 '21 09:04 EelcoHoogendoorn

I would even go a step further and say that the return value should not be a single tensor of shape [batch, n, whatever], but some representation that allows for variable shapes of the transformations as well. E.g., the SwAV paper returns up to 8 transformed images (2×224 + 6×96). I would like to move to a solution that can still handle the SwAV case in a rather simple way.
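For illustration, a multi-crop transform along those lines could return a list with entries of different sizes (a sketch using plain torchvision, not lightly's API at the time):

```python
import torchvision.transforms as T

class MultiCropTransform:
    """Return 2 global 224px crops plus 6 local 96px crops as a list."""

    def __init__(self):
        self.global_crop = T.Compose([T.RandomResizedCrop(224), T.ToTensor()])
        self.local_crop = T.Compose([T.RandomResizedCrop(96), T.ToTensor()])

    def __call__(self, img):
        # A list can hold views of different spatial sizes, which a single
        # stacked [batch, n, ...] tensor cannot.
        return [self.global_crop(img) for _ in range(2)] + [
            self.local_crop(img) for _ in range(6)
        ]
```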

We should also consider changing the way the models work, because they also assume tuple inputs. That can be quite annoying/restrictive when trying out new things, but it should be discussed in another issue.

IgorSusmelj avatar Apr 09 '21 13:04 IgorSusmelj

Good point, we should definitely work on that asap so we become more flexible!

In my opinion, the only way to work with flexible input shapes is to use a list/tuple of transformed batches - do you guys see any alternatives?

philippmwirth avatar Apr 12 '21 06:04 philippmwirth

@philippmwirth and I just had a chat and we didn't figure out a smooth way yet to cover all the cases.

On the one hand, we try to standardize the interfaces between the models to make them "exchangeable". On the other hand, it should still be easy to implement custom models/loss functions/augmentations. We would propose postponing a solution until we have implemented a few more papers with varying augmentations, e.g. DetCon or SwAV. Once we have tried implementing those models, we will know where we should tweak things.

Please, feel free to add any other suggestions to this thread. Every input is welcome :)

IgorSusmelj avatar Apr 12 '21 14:04 IgorSusmelj

What I've hacked into my dev branch is to just add a parameter for the number of augmentations to the collate mechanism, and to work with the tuple that is returned. I don't have the overview to see whether that runs into trouble in other parts of the code, but with a default of 2 I suppose it should be fully backwards compatible. It's not a complete solution, but it does solve my immediate problem.
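Roughly along these lines (a sketch with hypothetical names, not the actual lightly collate classes):

```python
import torch

class NViewsCollateFunction:
    """Collate that returns an n-tuple of view batches; the default
    n_views=2 keeps the existing pair behaviour, so it stays
    backwards compatible."""

    def __init__(self, transform, n_views=2):
        self.transform = transform
        self.n_views = n_views

    def __call__(self, batch):
        # One batched tensor per augmentation pass; downstream code that
        # unpacks (x0, x1) keeps working unchanged for n_views=2.
        views = tuple(
            torch.stack([self.transform(img) for img, _ in batch])
            for _ in range(self.n_views)
        )
        labels = torch.tensor([label for _, label in batch])
        return views, labels
```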

EelcoHoogendoorn avatar Apr 12 '21 19:04 EelcoHoogendoorn

That's also the approach I would have taken :) definitely a good fix for the moment

philippmwirth avatar Apr 13 '21 05:04 philippmwirth

We refactored the augmentations, and the return type of the transforms is now a list of tensors, where each tensor contains the views for one batch of images.
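For illustration, a small usage sketch with the post-refactor API (check the lightly docs for the exact SimCLRTransform signature):

```python
from PIL import Image
from lightly.transforms import SimCLRTransform

transform = SimCLRTransform(input_size=224)
image = Image.new("RGB", (256, 256))

views = transform(image)           # list of view tensors, one per view
print(len(views), views[0].shape)  # 2 views, each of shape [3, 224, 224]
```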

guarin avatar Aug 25 '23 11:08 guarin