FlagAI
AltCLIP has no effect on CIFAR10 after finetune
Description
I trained AltCLIP on CIFAR10 using the official demo code. However, after 3 epochs the finetuned weights have no effect on CIFAR10 images: recognition of the animal images in CIFAR10 actually fails completely after finetuning. I wonder whether the way I load the weights is wrong. The inference code below is taken from the demo.
import torch
from PIL import Image
from flagai.auto_model.auto_loader import AutoLoader

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

loader = AutoLoader(
    task_name="txt_img_matching",
    model_name="AltCLIP-XLMR-L",  # Load the checkpoints from ModelHub (model.baai.ac.cn/models)
    model_dir="./checkpoints/"
)
model = loader.get_model()
tokenizer = loader.get_tokenizer()
transform = loader.get_transform()

# Load the finetuned weights on top of the pretrained model
weight_file = './checkpoints/cifar_altclip_9k/AltCLIP-XLMR-L/pytorch_model.bin'
model.load_state_dict(torch.load(weight_file, map_location='cpu')['module'])
model.eval()
model.to(device)
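One way to rule out a weight-loading problem is to compare the checkpoint's keys with the model's own `state_dict` before loading. The sketch below is a hypothetical sanity check using a tiny stand-in `nn.Linear` model (not the actual AltCLIP model); a common failure mode is a `module.` prefix left over from a DeepSpeed or DataParallel wrapper, which makes `load_state_dict` silently skip every weight when loaded non-strictly.

```python
import torch
import torch.nn as nn

# Stand-in model for illustration; substitute the real AltCLIP model in practice
model = nn.Linear(4, 2)

# Simulate a checkpoint saved with a 'module.' prefix (common with wrappers)
ckpt = {'module.' + k: v for k, v in model.state_dict().items()}

# Any key that appears in only one of the two sets will not be loaded
missing = set(model.state_dict()) - set(ckpt)
unexpected = set(ckpt) - set(model.state_dict())
print("missing:", sorted(missing))
print("unexpected:", sorted(unexpected))

# Strip the prefix and load non-strictly; the result reports leftover mismatches
fixed = {k.replace('module.', '', 1): v for k, v in ckpt.items()}
result = model.load_state_dict(fixed, strict=False)
assert not result.missing_keys and not result.unexpected_keys
```

If `missing` or `unexpected` is non-empty with the real checkpoint, the finetuned weights are not actually reaching the model.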
I also finetuned on both my own dataset and CIFAR10 using the official demo code. In both runs the loss stays pinned at a constant value, and the finetuned model produces the same output regardless of the input. So I suspect there is a bug in the finetuning code's loss computation. In addition, I found that the official demo code feeds the numeric labels (e.g. 1, 2) to the text encoder instead of the class names (e.g. "dog").
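The numeric-label issue matters because a CLIP-style text encoder needs natural language on the text side; an integer index carries no semantics. A minimal sketch of the fix, using a hypothetical `label_to_prompt` helper and the standard CIFAR-10 class order:

```python
# Standard CIFAR-10 class names, in label order
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]

def label_to_prompt(label: int) -> str:
    """Turn a numeric CIFAR-10 label into a natural-language prompt."""
    return f"a photo of a {CIFAR10_CLASSES[label]}"

print(label_to_prompt(5))  # → "a photo of a dog"
```

The prompts, not the raw integers, would then be tokenized and passed as the text inputs during finetuning.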
[2024-01-23 05:42:53,494] [INFO] [logger.py:71:log_dist] [Rank -1] iteration 67350/ 198605 | elapsed time per iteration (ms): 434.3 | learning rate 8.804E-05 | loss 3.465734E+00 |
[2024-01-23 05:43:15,389] [INFO] [logger.py:71:log_dist] [Rank -1] iteration 67400/ 198605 | elapsed time per iteration (ms): 437.9 | learning rate 8.803E-05 | loss 3.465734E+00 |
[2024-01-23 05:43:37,527] [INFO] [logger.py:71:log_dist] [Rank -1] iteration 67450/ 198605 | elapsed time per iteration (ms): 442.8 | learning rate 8.802E-05 | loss 3.465734E+00 |
[2024-01-23 05:44:01,020] [INFO] [logger.py:71:log_dist] [Rank -1] iteration 67500/ 198605 | elapsed time per iteration (ms): 469.9 | learning rate 8.801E-05 | loss 3.465734E+00 |
[2024-01-23 05:44:21,606] [INFO] [logger.py:71:log_dist] [Rank -1] iteration 67550/ 198605 | elapsed time per iteration (ms): 411.7 | learning rate 8.799E-05 | loss 3.465734E+00 |
[2024-01-23 05:44:43,518] [INFO] [logger.py:71:log_dist] [Rank -1] iteration 67600/ 198605 | elapsed time per iteration (ms): 438.2 | learning rate 8.798E-05 | loss 3.465734E+00 |
[2024-01-23 05:45:05,781] [INFO] [logger.py:71:log_dist] [Rank -1] iteration 67650/ 198605 | elapsed time per iteration (ms): 445.3 | learning rate 8.797E-05 | loss 3.465734E+00 |
[2024-01-23 05:45:29,410] [INFO] [logger.py:71:log_dist] [Rank -1] iteration 67700/ 198605 | elapsed time per iteration (ms): 472.6 | learning rate 8.796E-05 | loss 3.465734E+00 |
[2024-01-23 05:45:53,111] [INFO] [logger.py:71:log_dist] [Rank -1] iteration 67750/ 198605 | elapsed time per iteration (ms): 474.0 | learning rate 8.794E-05 | loss 3.465734E+00 |
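One possibly relevant observation about the logs above: the constant loss 3.465734 agrees with ln(32) to five decimal places. For a CLIP-style symmetric contrastive (InfoNCE) loss, ln(N) is exactly the value obtained when the similarity logits are uniform over a batch of N image-text pairs, i.e. when the model's output ignores its inputs. This is only suggestive (it assumes the training batch size was 32), but it is consistent with a collapsed model or a loss that never receives meaningful gradients.

```python
import math

# ln(32) is the InfoNCE loss for uniform logits over a batch of 32 pairs
print(math.log(32))  # 3.4657359027997265
```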