
Cannot reproduce the claimed 98% test accuracy after training in my_first_few_shot_classifier

Open martin0258 opened this issue 1 year ago • 4 comments

Problem I'm running my_first_few_shot_classifier.ipynb without modifying anything, and my test accuracy after training is only 86.90%, i.e., almost no improvement over no training at all (86.44%) and a large gap from the 98% test accuracy claimed in the notebook.

Eval before training

100/100 [00:00<00:00, 197.05it/s]
Model tested on 100 tasks. Accuracy: 86.44%

Training log (the loss jumped between 0.2X ~ 0.3X)

100%|████████████████████| 40000/40000 [07:58<00:00, 83.61it/s, loss=0.305]

Eval after training

100/100 [00:00<00:00, 194.86it/s]
Model tested on 100 tasks. Accuracy: 86.90%

Considered solutions None yet; maybe it's caused by the seed or different package versions? The fact that the loss did not go down during training may be a big clue to the reproducibility problem.

How can we help Any ideas what I may miss?

My environment

  • GPU: RTX 3090 (24GB RAM)
  • CUDA: nvcc --version (Cuda compilation tools, release 11.8, V11.8.89)
  • OS: Ubuntu 20.04
  • Python 3.10
  • Python package version
easyfsl==1.5.0
torch==2.1.2
torchvision==0.16.2
  • resnet18 pretrained weight used before training (https://download.pytorch.org/models/resnet18-f37072fd.pth)

martin0258 avatar Jan 30 '24 09:01 martin0258

Hi. I ran the notebook on Colab and could not reproduce the error:

  • the loss decreases during training (from ~1.2 to ~0.25)
  • 98.18% accuracy during evaluation

My versions for easyfsl, torch and torchvision match yours.

The random seed seems like an unlikely cause for this kind of difference in the result. You may try your theory by fixing the seeds and running the training again.
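To test the seed theory, fixing all the relevant RNGs at the top of the notebook might look like this (a minimal sketch; the helper name is illustrative, not part of easyfsl):

```python
import random

import numpy as np
import torch

def fix_seeds(seed: int = 0) -> None:
    # Seed every RNG that could affect task sampling and training.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade speed for determinism in cuDNN kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

fix_seeds(42)
```

If two runs with the same seed still diverge, the seed is unlikely to be the cause and package or hardware differences become the more likely suspects.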

Did you make any change to the notebook?

ebennequin avatar Jan 30 '24 10:01 ebennequin

@ebennequin Hi, thanks for the quick reply!

The only change I made was commenting out the following two lines to avoid using your trained weights:

# !wget https://public-sicara.s3.eu-central-1.amazonaws.com/easy-fsl/resnet18_with_pretraining.tar
# model.load_state_dict(torch.load("resnet18_with_pretraining.tar", map_location="cuda"))

martin0258 avatar Jan 30 '24 11:01 martin0258

@ebennequin Update: I just ran the notebook on Colab (changing nothing) and after training on a T4 GPU I got Accuracy: 90.90%, better than in my local environment but still a large gap compared with your result.

martin0258 avatar Jan 30 '24 14:01 martin0258

I also didn't load the pretrained weights. I'm sorry, I am unable to reproduce or explain this gap in the results.

I am keeping this issue open for now. Feel free to share any new findings that could help us solve the issue.

ebennequin avatar Feb 01 '24 10:02 ebennequin