Question on Pretext Tasks
Hello, many thanks for your efforts in introducing LightlyTrain. I am new to self-supervised learning and would appreciate your response to a conceptual question I have. As far as I understand, SSL requires a pretext task to be defined and solved, and these pretext tasks are the cornerstone of SSL. In LightlyTrain, how can these tasks be defined or configured during pre-training? Is this done through the custom augmentation transforms configuration? Is it only available in LightlySSL? I checked the LightlyTrain documentation and haven't found anything on pretext tasks.
Thank you.
Hi @uhijjawi and thank you for your question!
In contrast to supervised learning, SSL indeed uses so-called pretext tasks as training objectives for the model.
One such pretext task is to learn invariances, which is the cornerstone of techniques like contrastive learning and self-distillation: the goal is to teach the model to produce similar predictions for images under certain augmentations, i.e. the model should become invariant to these augmentations. I would have a look at the SimCLR paper (contrastive learning) as well as the BYOL and DINO papers (both self-distillation) to better understand how this works.
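To make the invariance idea a bit more concrete, here is a minimal sketch of a SimCLR-style objective in plain PyTorch (simplified, not the exact loss from any of these papers); `encoder` and `augment` in the usage comment are placeholders for your backbone and augmentation pipeline:

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """Simplified NT-Xent (SimCLR-style) loss for two batches of embeddings."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)              # (2N, D)
    sim = z @ z.t() / temperature               # pairwise cosine similarities
    n = z1.shape[0]
    sim.fill_diagonal_(float("-inf"))           # never match an embedding with itself
    # The positive for view i is the other augmented view of the same image.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z1.device)
    return F.cross_entropy(sim, targets)

# Pretext task: two augmented views of the same image should map to similar embeddings.
# view1, view2 = augment(images), augment(images)
# loss = nt_xent_loss(encoder(view1), encoder(view2))
```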
Another pretext task is to make the model learn image semantics by filling in masked-out parts of the image. This is the concept behind e.g. SimMIM and MAE.
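Again as a rough sketch only (simplified compared to the actual MAE/SimMIM architectures, where masked patches are dropped from the encoder input or replaced by mask tokens), the pretext task boils down to reconstructing the masked patches; `encoder_decoder` is a placeholder model:

```python
import torch
import torch.nn.functional as F

def masked_reconstruction_loss(patches, encoder_decoder, mask_ratio=0.75):
    """Sketch of a masked-image-modeling pretext task on patchified images.

    `patches` has shape (batch, num_patches, patch_dim); `encoder_decoder` is a
    placeholder for a model that predicts all patches from the masked input.
    """
    b, n, d = patches.shape
    # Randomly mask out a large fraction of the patches.
    mask = torch.rand(b, n, device=patches.device) < mask_ratio  # True = masked
    visible = patches * (~mask).unsqueeze(-1)                    # zero out masked patches
    reconstruction = encoder_decoder(visible)                    # (b, n, d)
    # The pretext task: predict the content of the masked patches only.
    return F.mse_loss(reconstruction[mask], patches[mask])
```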
There are of course more pretext tasks, but those are the most prominent ones. In LightlyTrain we additionally support distillation (which is not strictly SSL); there the pretext task is that the predictions of the small (untrained) model should match those of the larger (already trained) model.
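Just to illustrate the principle (this is a generic feature-distillation objective, not the exact one implemented in LightlyTrain), the student is trained to reproduce the embeddings of a frozen teacher:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_features, teacher_features):
    """Sketch of feature distillation: the small, untrained student should
    reproduce the embeddings of a large, frozen, pretrained teacher."""
    student_features = F.normalize(student_features, dim=1)
    teacher_features = F.normalize(teacher_features, dim=1)
    # Maximize cosine similarity between student and teacher embeddings.
    return 1 - (student_features * teacher_features).sum(dim=1).mean()

# `student` and `teacher` are placeholders; the teacher's weights stay frozen.
# with torch.no_grad():
#     t = teacher(images)
# loss = distillation_loss(student(images), t)
```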
If you'd like to actually understand how SSL works, I would recommend that you stick with LightlySSL for now. There we provide building blocks so that you can build your own SSL pipelines or quickly compare existing approaches to each other. All currently supported models (each with their own pretext tasks) are listed in the Models section of the docs.
LightlyTrain, on the other hand, provides production-ready SSL and distillation. Its design goal is not education about SSL or SSL research, but ease of use. For this reason, the pretext tasks are mostly hidden from the user.
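In practice a pretraining run looks roughly like this (please check the LightlyTrain docs for the exact arguments and available options; the paths and model string here are just placeholders); the pretext task is selected via the `method` argument and everything else is handled internally:

```python
import lightly_train

# Roughly how a LightlyTrain pretraining run looks (see the docs for details).
lightly_train.train(
    out="out/my_experiment",        # where checkpoints and logs are written
    data="path/to/images",          # directory of unlabeled images
    model="torchvision/resnet50",   # backbone to pretrain
    method="distillation",          # pretext task / pretraining method
)
```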
I hope this answers your question, and I recommend joining our Discord, where you can learn from other people who use SSL.
Many thanks for the informative answer. I have joined the Lightly Discord too!
Another question, please: I'm reading the Models section of the LightlySSL documentation as suggested. Where can I find the pretext tasks used for each method/model? (e.g., the pretext task being the data augmentations in SimCLR) https://docs.lightly.ai/self-supervised-learning/examples/simclr.html#
Hi @uhijjawi,
I think in order to fully understand what the pretext tasks are, you will have to actually look through the research papers of the individual methods. The links to the papers are also listed on the model pages, usually in the Reference section.
The pretext task is not only the augmentations, but usually a combination of several things, for example: 1. the augmentations, 2. the loss function, and 3. the optimization scheme (e.g. self-distillation uses exponential moving averages for the teacher updates).
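As a quick illustration of the last point, an EMA teacher update in self-distillation methods looks roughly like this (a generic sketch, not the exact code of any specific method):

```python
import torch

@torch.no_grad()
def update_teacher(teacher, student, momentum=0.99):
    """EMA update used in self-distillation (e.g. BYOL, DINO): the teacher
    weights are a slowly moving average of the student weights."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1 - momentum)
```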