Branch: philipp-lig-1511-implement-smog
DRAFT: Add SMoG
Draft implementation of SMoG: https://arxiv.org/pdf/2207.06167.pdf
Currently, the CIFAR10 benchmark reaches approximately 65%
accuracy after 200 epochs (depending on the exact set of hyperparameters).
There are some questions and problems I'm encountering:
- The loss behaves strangely. After resetting the cluster centers, it grows quickly and then starts going down again.
- In the paper, they clearly state that they use a prediction head on top of the momentum network. In my opinion this doesn't make sense because no gradient flows through the momentum network, so I left it out in this draft.
- There's a possible typo in the paper: the group assignment uses argmin where I'd expect argmax, so I used argmax instead (see the sketch after the note below).
- Finally, they mention in the paper that they use an asymmetric version of the BYOL augmentations, where one view uses stronger solarization and the other stronger Gaussian blur. However, there's no further information about this in the paper, so I simply used the SimCLR augmentations.
Note: The repeated KMeans clustering slows down training quite a bit.
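For context, here is a minimal sketch of the assignment and reset steps discussed above, assuming PyTorch and scikit-learn. The class name `SMoGPrototypes`, the default sizes, and the method names are placeholders for illustration, not the final lightly API:

```python
# Minimal sketch of the group assignment (argmax over prototype similarities)
# and the periodic KMeans reset of the group prototypes. All names and default
# sizes are placeholders for illustration, not the final lightly API.
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans


class SMoGPrototypes(torch.nn.Module):
    def __init__(self, n_groups: int = 300, dim: int = 128):
        super().__init__()
        # Group prototypes ("momentum groups"), kept L2-normalized.
        self.register_buffer(
            "prototypes", F.normalize(torch.randn(n_groups, dim), dim=1)
        )

    def assign(self, features: torch.Tensor) -> torch.Tensor:
        # Assign each feature to the prototype with the highest cosine
        # similarity (argmax rather than the argmin printed in the paper).
        similarities = F.normalize(features, dim=1) @ self.prototypes.t()
        return similarities.argmax(dim=1)

    @torch.no_grad()
    def reset(self, memory_bank: torch.Tensor) -> None:
        # Periodic reset: re-cluster the feature memory bank with KMeans and
        # overwrite the prototypes. This is the step after which the loss spikes.
        kmeans = KMeans(n_clusters=self.prototypes.shape[0], n_init=10)
        kmeans.fit(memory_bank.detach().cpu().numpy())
        centers = torch.from_numpy(kmeans.cluster_centers_).float()
        self.prototypes.copy_(F.normalize(centers, dim=1).to(self.prototypes.device))
```

Since the reset replaces all prototypes at once, a loss jump right after each reset seems at least plausible.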
Codecov Report
Base: 89.54% // Head: 89.01% // Decreases project coverage by -0.53% :warning:
Coverage data is based on head (1cdf457) compared to base (7cf37f0). Patch coverage: 38.88% of modified lines in pull request are covered.
Additional details and impacted files
@@ Coverage Diff @@
## master #888 +/- ##
==========================================
- Coverage 89.54% 89.01% -0.54%
==========================================
Files 99 99
Lines 4525 4561 +36
==========================================
+ Hits 4052 4060 +8
- Misses 473 501 +28
Impacted Files | Coverage Δ | |
---|---|---|
lightly/data/collate.py | 92.85% <25.00%> (-4.61%) | :arrow_down: |
lightly/models/modules/heads.py | 84.25% <38.46%> (-14.53%) | :arrow_down: |
lightly/models/modules/__init__.py | 100.00% <100.00%> (ø) | |
lightly/api/api_workflow_download_dataset.py | 87.36% <0.00%> (-6.32%) | :arrow_down: |
Sorry for the typos.
- The prediction head is added on f_\theta.
- argmin -> argmax in Eq.4.
- SimCLR's augmentation is OK for SMoG.
Thanks a lot for the help @BoPang1996! I will try your suggestions and post an update.
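For the prediction head placement, here is a rough sketch of how I read the clarification: the prediction head sits on top of the online branch f_\theta, while the momentum branch keeps only the backbone and projection head. The class name, generic MLP heads, and dimensions below are stand-ins for illustration, not the actual lightly modules:

```python
# Rough sketch of the corrected layout: prediction head on the online branch
# f_theta only; the momentum branch has no prediction head and no gradients.
# Class name, head sizes, and dimensions are placeholders for illustration.
import copy

import torch
import torch.nn as nn


class SMoGModelSketch(nn.Module):
    def __init__(self, backbone: nn.Module, feature_dim: int = 512, out_dim: int = 128):
        super().__init__()
        # Online branch f_theta: backbone -> projection head -> prediction head.
        self.backbone = backbone
        self.projection_head = nn.Sequential(
            nn.Linear(feature_dim, 2048), nn.BatchNorm1d(2048), nn.ReLU(),
            nn.Linear(2048, out_dim),
        )
        self.prediction_head = nn.Sequential(
            nn.Linear(out_dim, 2048), nn.BatchNorm1d(2048), nn.ReLU(),
            nn.Linear(2048, out_dim),
        )
        # Momentum branch: copies of backbone and projection head only.
        self.backbone_momentum = copy.deepcopy(backbone)
        self.projection_head_momentum = copy.deepcopy(self.projection_head)
        for param in self.backbone_momentum.parameters():
            param.requires_grad = False
        for param in self.projection_head_momentum.parameters():
            param.requires_grad = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = self.backbone(x).flatten(start_dim=1)
        projected = self.projection_head(features)
        return self.prediction_head(projected)

    @torch.no_grad()
    def forward_momentum(self, x: torch.Tensor) -> torch.Tensor:
        features = self.backbone_momentum(x).flatten(start_dim=1)
        return self.projection_head_momentum(features)
```

The momentum parameters would then be updated with an exponential moving average of the f_\theta weights rather than by backpropagation.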
Current best result:
------------------------------------------------------------------------------------------
| Model | Batch Size | Epochs | KNN Test Accuracy | Time      | Peak GPU Usage |
------------------------------------------------------------------------------------------
| SMoG  | 512        | 200    | 0.778             | 343.6 Min | 3.9 GByte      |
------------------------------------------------------------------------------------------
Todo:
- [ ] Get best hyperparameters for Cifar10 benchmarks
- [x] Add Imagenette benchmarks
- [x] Implement SMoGPredictionHead
- [x] Implement SMoGCollateFunction (asymmetric, see the sketch below)
- [ ] Add examples to docs
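Regarding the asymmetric SMoGCollateFunction, a sketch of what the BYOL-style asymmetry could look like with torchvision (one view favors Gaussian blur, the other solarization). The crop size, jitter strengths, kernel size, and probabilities are the usual BYOL defaults and are my assumption, not values from the SMoG paper:

```python
# Sketch of an asymmetric BYOL-style augmentation pair: view 1 favors Gaussian
# blur, view 2 favors solarization. Crop size, jitter strengths, kernel size,
# and probabilities are common BYOL defaults, assumed here for illustration.
import torchvision.transforms as T

common = [
    T.RandomResizedCrop(224),  # example input size, not CIFAR10-specific
    T.RandomHorizontalFlip(p=0.5),
    T.RandomApply([T.ColorJitter(0.4, 0.4, 0.2, 0.1)], p=0.8),
    T.RandomGrayscale(p=0.2),
]

# View 1: strong Gaussian blur, no solarization.
view_1 = T.Compose(common + [
    T.RandomApply([T.GaussianBlur(kernel_size=23)], p=1.0),
    T.ToTensor(),
])

# View 2: weak Gaussian blur, solarization enabled.
view_2 = T.Compose(common + [
    T.RandomApply([T.GaussianBlur(kernel_size=23)], p=0.1),
    T.RandomSolarize(threshold=128, p=0.2),
    T.ToTensor(),
])
```

A collate function would then apply `view_1` to one copy of each image in the batch and `view_2` to the other.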