
Reproducing SimCLR on ImageNet-1K from provided pretrained ckpt

Open XilinGong opened this issue 8 months ago • 5 comments

I downloaded the checkpoint provided in this project, named "epoch=99-step=500400.ckpt", and used the linear_eval code in the benchmark script to evaluate it. However, with the default settings (lr=0.1) I got NaN during training. I changed the learning rate to 1e-4 to avoid the NaN, but the accuracy is still very low, only about 1%, far from the reported accuracy. I also tried KNN_eval, and the accuracy there is only about 0.5%.

How can I solve this problem? Are any other changes to the code required? I have made sure that the checkpoint is loaded correctly.

Thank you so much

XilinGong avatar May 07 '25 01:05 XilinGong

Thanks for the issue!

Was able to reproduce this on the main branch:

Image

Ran it with:

python main.py --train-dir /datasets/imagenet1k/train --val-dir /datasets/imagenet1k/val/ --epochs 0 --ckpt-path ../../../epoch\=99-step\=500400.ckpt --skip-linear-eval --skip-finetune-eval --methods simclr

Will investigate and get back to you ASAP :)

guarin avatar May 09 '25 14:05 guarin

Looks like this might be related to the changes we introduced in #1800

I tried with commit b6955fd40b9b8e2f11cbd6d291820281ed47ba3a (v1.5.18) and got the expected results:

Image

So the checkpoint weights are good but there is an issue in the evaluation code.

guarin avatar May 09 '25 14:05 guarin

Thanks for your help! I can get expected results with KNN_eval now. However, do you have any idea about linear_eval?

XilinGong avatar May 09 '25 18:05 XilinGong

Hi @XilinGong! We could reproduce the wrong KNN results, but not the linear eval problem.

Image

Could you elaborate on what was wrong with your linear eval and your settings?

yutong-xiang-97 avatar May 22 '25 12:05 yutong-xiang-97

Hi @XilinGong, we were able to fix the problem with reproducing the KNN results. It was caused by a change to the default knn_t, which now differs from the value that was used for SimCLR. The PR https://github.com/lightly-ai/lightly/pull/1840 should fix this.
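For readers wondering why a temperature setting can change KNN accuracy, here is a minimal, self-contained sketch of temperature-weighted KNN voting. It assumes that knn_t acts as a softmax-style temperature on cosine similarities, which is the common formulation; it is not copied from the library, and the function names and toy data below are hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_class_scores(query, bank, labels, k, temperature):
    """Normalized class scores from the k most similar bank features.

    Each neighbor votes with weight exp(similarity / temperature), so a
    small temperature lets the closest neighbors dominate and a large one
    approaches a plain majority vote. (Illustrative assumption, not the
    library's exact code.)
    """
    neighbors = sorted(
        ((cosine(query, feat), lab) for feat, lab in zip(bank, labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    scores = defaultdict(float)
    for sim, label in neighbors:
        scores[label] += math.exp(sim / temperature)
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


def knn_predict(query, bank, labels, k, temperature):
    """Return the label with the highest weighted score."""
    scores = knn_class_scores(query, bank, labels, k, temperature)
    return max(scores, key=scores.get)


# Toy data: one very close neighbor of class 0, two farther ones of class 1.
query = (1.0, 0.0)
bank = [(1.0, 0.05), (1.0, 1.0), (1.0, -1.0)]
labels = [0, 1, 1]

pred_low_t = knn_predict(query, bank, labels, k=3, temperature=0.1)
pred_high_t = knn_predict(query, bank, labels, k=3, temperature=1.0)
print(pred_low_t, pred_high_t)
```

On this toy data the predicted class flips between the two temperatures, which illustrates how evaluating a checkpoint with a different temperature than the one used for the reported numbers can change the measured KNN accuracy.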

Here is the KNN eval result after the fix:

Image

Thank you again for opening an issue and helping us improve!

yutong-xiang-97 avatar Jun 03 '25 09:06 yutong-xiang-97