Efficient-AI-Backbones

Evaluation Mode doesn't work.

Open Leanna97 opened this issue 1 year ago • 5 comments

Thanks for your great work. I trained a 'vig_ti_224_gelu' model and its accuracy reached 78.22%, according to the summary.csv generated during training. Resuming from the saved checkpoint with '--resume' works well. However, when I evaluate the saved model with '--evaluate', the accuracy drops to 1.04% and the loss is NaN. Could anyone help me solve this problem?

summary.csv: image

'--evaluate': image

Leanna97 avatar Apr 28 '23 10:04 Leanna97

Hi, have you found a solution? I have the same problem.

YatingHuang7 avatar Aug 14 '23 14:08 YatingHuang7

I think it's a version issue. Please refer to https://github.com/huawei-noah/Efficient-AI-Backbones/issues/219#issuecomment-1649928890

iamhankai avatar Aug 15 '23 11:08 iamhankai

Same problem. I tried torchvision==0.8.2, torch==1.7.1, timm==0.3.2, and CUDA 11.0, but it still didn't work in evaluation mode.

FreeZ3e avatar Mar 23 '24 05:03 FreeZ3e

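I solved the problem by changing the directory structure of my ImageNet copy. In my layout, val contained many JPEG files while train contained many folders. According to the Colab notebook from the commenter in #219, val should contain many folders and train a single folder. So simply swapping the names of the two directories gives the correct output.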
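To check which layout a split actually has before swapping anything, a quick directory scan may help. This is only a sketch: the helper names `summarize_split` and `has_class_folders` are made up for illustration, and it assumes the evaluation loader wants one subfolder per class with no loose images at the top level, which is the layout torchvision's `ImageFolder` reads.

```python
import os

# File extensions treated as images; extend this tuple if your copy uses others.
IMAGE_EXTS = (".jpeg", ".jpg", ".png")


def summarize_split(split_dir):
    """Count subfolders and loose image files directly under split_dir."""
    subdirs, loose_images = 0, 0
    for entry in os.scandir(split_dir):
        if entry.is_dir():
            subdirs += 1
        elif entry.name.lower().endswith(IMAGE_EXTS):
            loose_images += 1
    return subdirs, loose_images


def has_class_folders(split_dir):
    """True if split_dir holds at least one subfolder and no loose images."""
    subdirs, loose_images = summarize_split(split_dir)
    return subdirs > 0 and loose_images == 0
```

Calling `has_class_folders` on the directory passed as the validation split should return `True`. If it returns `False` because the directory is a flat pile of JPEG files, the labels cannot be read from the folder names, which would be consistent with the near-chance accuracy reported above.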

xxrrnn avatar Apr 11 '24 13:04 xxrrnn

I solved this problem by using "--resume" instead of "--pretrain_path" to load the pretrained model. If you want to use "--pretrain_path" to evaluate a model, please save the model with "torch.save()" first.
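One possible way to do that re-saving is sketched below. It rests on two assumptions that this thread does not confirm: that the checkpoint written during training is a dict holding the weights under a `state_dict` key (as timm's checkpoint saver does), and that "--pretrain_path" expects a bare state dict. The helper names `extract_state_dict` and `convert_checkpoint` are hypothetical.

```python
def extract_state_dict(checkpoint):
    """Return the weights from a full training checkpoint or a bare state dict."""
    # Assumption: full checkpoints keep the weights under the "state_dict" key.
    if isinstance(checkpoint, dict) and "state_dict" in checkpoint:
        return checkpoint["state_dict"]
    # Otherwise assume the object already is a bare state dict.
    return checkpoint


def convert_checkpoint(src_path, dst_path):
    """Load a checkpoint file and re-save only its weights with torch.save()."""
    import torch  # imported lazily so extract_state_dict works without PyTorch

    # Recent PyTorch releases may need weights_only=False here if the
    # checkpoint also stores non-tensor objects such as training arguments.
    checkpoint = torch.load(src_path, map_location="cpu")
    torch.save(extract_state_dict(checkpoint), dst_path)
```

The file written by `convert_checkpoint` could then be passed to "--pretrain_path". If evaluation still reports a NaN loss, the checkpoint format was probably not the cause.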

FreeZ3e avatar Apr 25 '24 09:04 FreeZ3e