
It seems that the model has not learned anything. What should I do?

Open zbw0329 opened this issue 3 years ago • 10 comments

Thanks for your excellent work! I changed the dataloader to use JigClu on CIFAR-10 and trained the model for 1000 epochs, but my model's predictions are all the same. It seems the model always assigns everything to the same cluster.

zbw0329 avatar Nov 20 '21 04:11 zbw0329

When I use linear evaluation to evaluate my model, should I train with main_lincls.py first?

zbw0329 avatar Nov 20 '21 04:11 zbw0329


I think you only need to change the dataloader and augmentation to CIFAR-10 and set num_classes=10. Also check that you can load the pretrained weights correctly.
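A quick, framework-free way to do that last check is to diff the checkpoint's keys against the keys the model expects; any "missing" keys will be left randomly initialized. This is a minimal sketch — the key names below are invented for illustration, not the repo's actual ones:

```python
def check_pretrained_load(model_keys, ckpt_state_dict):
    """Compare a checkpoint's keys against the model's expected keys.

    Returns (missing, unexpected): keys the model needs but the checkpoint
    lacks, and checkpoint keys the model does not recognize.
    """
    missing = sorted(set(model_keys) - set(ckpt_state_dict))
    unexpected = sorted(set(ckpt_state_dict) - set(model_keys))
    return missing, unexpected

# Hypothetical parameter names, for illustration only.
model_keys = ["backbone.conv1.weight", "backbone.layer1.0.weight", "fc.weight"]
ckpt = {"backbone.conv1.weight": 0.1, "backbone.layer1.0.weight": 0.2}

missing, unexpected = check_pretrained_load(model_keys, ckpt)
print(missing)  # these layers stay randomly initialized: ['fc.weight']
```

If `missing` covers the whole backbone (for example because of a `module.` prefix mismatch from DataParallel-style checkpoints), linear evaluation is effectively probing a random network.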

akuxcw avatar Nov 22 '21 02:11 akuxcw

What is the right order of training? Should I use main.py first to get a checkpoint, and then use main_lincls.py to evaluate that checkpoint? Or should I load the checkpoint as a resume point and continue training with main_lincls.py to get the linear checkpoint?

zbw0329 avatar Nov 24 '21 02:11 zbw0329

I get 37 Acc@1 on CIFAR-10 after 300 epochs, which is far from the result in your paper. I use a learning rate of 10; should I make it smaller?

zbw0329 avatar Nov 24 '21 02:11 zbw0329

I'm confused about the order in which main.py and main_lincls.py should be used.

zbw0329 avatar Nov 24 '21 02:11 zbw0329

You should: 1. use main.py to get the checkpoint; 2. use main_lincls.py to load that checkpoint as pretrained weights (not to resume training).
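The pretrained-vs-resume distinction can be sketched with a pure-Python stand-in (the field names below are illustrative; the actual scripts' checkpoint format may differ):

```python
def start_linear_eval(checkpoint, mode):
    """Illustrative stand-in for the two ways a linear-eval script could
    consume a checkpoint produced by pretraining.

    mode="pretrained": take only the model weights; start a fresh run
    (epoch 0, fresh optimizer) that trains just the classifier on top.
    mode="resume": restore everything (epoch counter, optimizer state)
    and continue training -- NOT what linear evaluation should do.
    """
    if mode == "pretrained":
        return {"weights": checkpoint["state_dict"],
                "start_epoch": 0,
                "optimizer": "fresh"}
    elif mode == "resume":
        return {"weights": checkpoint["state_dict"],
                "start_epoch": checkpoint["epoch"],
                "optimizer": checkpoint["optimizer"]}
    raise ValueError(f"unknown mode: {mode}")

ckpt = {"state_dict": {"backbone.w": 0.5}, "epoch": 300, "optimizer": "saved-state"}
run = start_linear_eval(ckpt, mode="pretrained")
print(run["start_epoch"])  # 0: a fresh linear-eval run, not a continuation
```

Loading as a resume point would also restore the pretraining optimizer and epoch counter, which is why the two paths must not be mixed up.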

The results of CIFAR-10 in the paper are produced using ImageNet pretrained weights. I didn't try directly pretraining on CIFAR-10.

Actually, 37% accuracy under linear evaluation shows the model's weights are not random. It seems the model learns some features, just not very good ones. The cause may be inappropriate hyper-parameters. Or maybe the CIFAR-10 dataset is too easy to learn, which makes the model's outputs collapse to a constant during training.
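One cheap way to tell whether the outputs have really collapsed is to measure the entropy of the predicted-cluster histogram: near-zero entropy means everything lands in one cluster, while a healthy run stays near log(k) for k clusters. A minimal sketch (the cluster ids are made up):

```python
import math
from collections import Counter

def prediction_entropy(preds):
    """Shannon entropy (in nats) of the empirical distribution of
    predicted cluster ids. 0.0 means total collapse; log(k) means a
    uniform spread over k clusters."""
    counts = Counter(preds)
    n = len(preds)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

collapsed = [3] * 1000                    # every sample sent to cluster 3
healthy = [i % 10 for i in range(1000)]   # evenly spread over 10 clusters
print(prediction_entropy(collapsed))          # 0.0 -> collapse
print(round(prediction_entropy(healthy), 4))  # 2.3026, i.e. ~log(10)
```

Logging this number every few epochs makes it obvious whether the collapse happens immediately (a setup bug) or gradually (a training-dynamics issue such as a too-large learning rate).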

akuxcw avatar Nov 24 '21 02:11 akuxcw

Oh, I see. I was using main_lincls.py to get the checkpoint and then using it to evaluate. I will retry with the order you suggested. Thanks for your help!

zbw0329 avatar Nov 24 '21 03:11 zbw0329

In Table 5 of your paper, what is the difference between 'finetune' and 'linear'? Is there any difference in their experimental process? Are their evaluation methods different?

zbw0329 avatar Nov 25 '21 02:11 zbw0329

In "linear", we load the pretrained weights and freeze the backbone, then train only a classifier. In "fine-tune", we load the pretrained weights as initialization and train the whole model normally. They are two different ways to measure the quality of the pretrained weights.
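The difference boils down to which parameters receive gradient updates. A torch-free sketch of that selection (the parameter names are invented for illustration):

```python
def trainable_params(all_params, mode):
    """Select which parameters get updated during training.

    mode="linear":   backbone frozen, only the classifier head trains.
    mode="finetune": everything trains; pretrained weights are just
                     the initialization.
    """
    if mode == "linear":
        return [p for p in all_params if p.startswith("fc.")]
    elif mode == "finetune":
        return list(all_params)
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical parameter names.
params = ["backbone.conv1.weight", "backbone.layer4.2.weight",
          "fc.weight", "fc.bias"]
print(trainable_params(params, "linear"))        # ['fc.weight', 'fc.bias']
print(len(trainable_params(params, "finetune"))) # 4
```

Because "linear" keeps the backbone fixed, its score reflects only the quality of the pretrained features, whereas "fine-tune" also measures how good an initialization they are.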

akuxcw avatar Nov 29 '21 08:11 akuxcw