Branch: philipp-lig-1511-implement-smog
DRAFT: Add SMoG
Draft implementation of SMoG: https://arxiv.org/pdf/2207.06167.pdf
Currently, the CIFAR10 benchmark reaches approximately 65%
accuracy after 200 epochs (depending on the exact set of hyperparameters).
There are some questions and problems I'm encountering:
- The loss behaves strangely. After resetting the cluster centers, it grows quickly and then starts going down again.
- In the paper, they clearly state that they use a prediction head on top of the momentum network. In my opinion this doesn't make sense because no gradient flows through the momentum network, so I left it out in this draft.
- There's a possible typo in the paper: the group assignment uses argmin where I'd expect argmax, so I used argmax instead (see the sketch after the note below).
- Finally, they mention in the paper that they use an asymmetric version of the BYOL augmentations, where one view uses stronger solarization and the other stronger Gaussian blur. However, there's no further information about this in the paper, so I simply used the SimCLR augmentations.
Note: The repeated KMeans clustering slows down training quite a bit.
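For context, here is a minimal sketch of the assignment and reset steps discussed above, assuming PyTorch and scikit-learn. The class name `SMoGPrototypes`, the default sizes, and the method names are placeholders for illustration, not the final lightly API:

```python
# Minimal sketch of the group assignment (argmax over prototype similarities)
# and the periodic KMeans reset of the group prototypes. All names and default
# sizes are placeholders for illustration, not the final lightly API.
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans


class SMoGPrototypes(torch.nn.Module):
    def __init__(self, n_groups: int = 300, dim: int = 128):
        super().__init__()
        # Group prototypes ("momentum groups"), kept L2-normalized.
        self.register_buffer(
            "prototypes", F.normalize(torch.randn(n_groups, dim), dim=1)
        )

    def assign(self, features: torch.Tensor) -> torch.Tensor:
        # Assign each feature to the prototype with the highest cosine
        # similarity (argmax rather than the argmin printed in the paper).
        similarities = F.normalize(features, dim=1) @ self.prototypes.t()
        return similarities.argmax(dim=1)

    @torch.no_grad()
    def reset(self, memory_bank: torch.Tensor) -> None:
        # Periodic reset: re-cluster the feature memory bank with KMeans and
        # overwrite the prototypes. This is the step after which the loss spikes.
        kmeans = KMeans(n_clusters=self.prototypes.shape[0], n_init=10)
        kmeans.fit(memory_bank.detach().cpu().numpy())
        centers = torch.from_numpy(kmeans.cluster_centers_).float()
        self.prototypes.copy_(F.normalize(centers, dim=1).to(self.prototypes.device))
```

Since the reset replaces all prototypes at once, a loss jump right after each reset seems at least plausible.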
Codecov Report
Base: 89.54% // Head: 89.01% // Decreases project coverage by -0.53% :warning:
Coverage data is based on head (1cdf457) compared to base (7cf37f0). Patch coverage: 38.88% of modified lines in pull request are covered.
Additional details and impacted files
@@ Coverage Diff @@
## master #888 +/- ##
==========================================
- Coverage 89.54% 89.01% -0.54%
==========================================
Files 99 99
Lines 4525 4561 +36
==========================================
+ Hits 4052 4060 +8
- Misses 473 501 +28
Impacted Files | Coverage Δ | |
---|---|---|
lightly/data/collate.py | 92.85% <25.00%> (-4.61%) | :arrow_down: |
lightly/models/modules/heads.py | 84.25% <38.46%> (-14.53%) | :arrow_down: |
lightly/models/modules/__init__.py | 100.00% <100.00%> (ø) | |
lightly/api/api_workflow_download_dataset.py | 87.36% <0.00%> (-6.32%) | :arrow_down: |
Sorry for the typos.
- The prediction head is added on f_\theta.
- argmin -> argmax in Eq.4.
- SimCLR's augmentation is OK for SMoG.
Thanks a lot for the help @BoPang1996! I will try your suggestions and post an update.
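For the prediction head placement, here is a rough sketch of how I read the clarification: the prediction head sits on top of the online branch f_\theta, while the momentum branch keeps only the backbone and projection head. The class name, generic MLP heads, and dimensions below are stand-ins for illustration, not the actual lightly modules:

```python
# Rough sketch of the corrected layout: prediction head on the online branch
# f_theta only; the momentum branch has no prediction head and no gradients.
# Class name, head sizes, and dimensions are placeholders for illustration.
import copy

import torch
import torch.nn as nn


class SMoGModelSketch(nn.Module):
    def __init__(self, backbone: nn.Module, feature_dim: int = 512, out_dim: int = 128):
        super().__init__()
        # Online branch f_theta: backbone -> projection head -> prediction head.
        self.backbone = backbone
        self.projection_head = nn.Sequential(
            nn.Linear(feature_dim, 2048), nn.BatchNorm1d(2048), nn.ReLU(),
            nn.Linear(2048, out_dim),
        )
        self.prediction_head = nn.Sequential(
            nn.Linear(out_dim, 2048), nn.BatchNorm1d(2048), nn.ReLU(),
            nn.Linear(2048, out_dim),
        )
        # Momentum branch: copies of backbone and projection head only.
        self.backbone_momentum = copy.deepcopy(backbone)
        self.projection_head_momentum = copy.deepcopy(self.projection_head)
        for param in self.backbone_momentum.parameters():
            param.requires_grad = False
        for param in self.projection_head_momentum.parameters():
            param.requires_grad = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = self.backbone(x).flatten(start_dim=1)
        projected = self.projection_head(features)
        return self.prediction_head(projected)

    @torch.no_grad()
    def forward_momentum(self, x: torch.Tensor) -> torch.Tensor:
        features = self.backbone_momentum(x).flatten(start_dim=1)
        return self.projection_head_momentum(features)
```

The momentum parameters would then be updated with an exponential moving average of the f_\theta weights rather than by backpropagation.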
Current best result:
------------------------------------------------------------------------------------------
| Model | Batch Size | Epochs | KNN Test Accuracy | Time      | Peak GPU Usage |
------------------------------------------------------------------------------------------
| SMoG  | 512        | 200    | 0.778             | 343.6 Min | 3.9 GByte      |
------------------------------------------------------------------------------------------
Todo:
- [ ] Get best hyperparameters for Cifar10 benchmarks
- [x] Add Imagenette benchmarks
- [x] Implement SMoGPredictionHead
- [x] Implement SMoGCollateFunction (asymmetric, see the sketch below)
- [ ] Add examples to docs
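Regarding the asymmetric SMoGCollateFunction, a sketch of what the BYOL-style asymmetry could look like with torchvision (one view favors Gaussian blur, the other solarization). The crop size, jitter strengths, kernel size, and probabilities are the usual BYOL defaults and are my assumption, not values from the SMoG paper:

```python
# Sketch of an asymmetric BYOL-style augmentation pair: view 1 favors Gaussian
# blur, view 2 favors solarization. Crop size, jitter strengths, kernel size,
# and probabilities are common BYOL defaults, assumed here for illustration.
import torchvision.transforms as T

common = [
    T.RandomResizedCrop(224),  # example input size, not CIFAR10-specific
    T.RandomHorizontalFlip(p=0.5),
    T.RandomApply([T.ColorJitter(0.4, 0.4, 0.2, 0.1)], p=0.8),
    T.RandomGrayscale(p=0.2),
]

# View 1: strong Gaussian blur, no solarization.
view_1 = T.Compose(common + [
    T.RandomApply([T.GaussianBlur(kernel_size=23)], p=1.0),
    T.ToTensor(),
])

# View 2: weak Gaussian blur, solarization enabled.
view_2 = T.Compose(common + [
    T.RandomApply([T.GaussianBlur(kernel_size=23)], p=0.1),
    T.RandomSolarize(threshold=128, p=0.2),
    T.ToTensor(),
])
```

A collate function would then apply `view_1` to one copy of each image in the batch and `view_2` to the other.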