How to obtain pre-trained baseline CAM

Open Spritea opened this issue 2 years ago • 14 comments

Hi, I have a question about how to obtain the pre-trained baseline CAM (res50_cam.pth).

Currently this repo directly provides the checkpoint, and I'd like to know how this model is trained. Could you please explain?

And if I want to use CLIMS on a custom dataset, do I need to retrain the res50_cam.pth based on my custom dataset? Thanks!

Spritea avatar Feb 20 '23 22:02 Spritea

Hi, set train_cam_pass=True and train_clims_pass=False to train the baseline CAM. It is supervised only by a multi-label classification loss.
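For readers unfamiliar with the term, here is a minimal sketch of a multi-label classification loss in plain Python. It assumes the common "multi-label soft margin" form (per-class binary cross-entropy on sigmoid outputs, averaged over classes); the thread does not say which exact loss function the repository calls, so the function name and the example values below are illustrative only.

```python
import math

def multilabel_soft_margin_loss(logits, targets):
    """Mean over classes of the binary cross-entropy on sigmoid(logit).

    logits:  one raw score per class (hypothetical classifier output)
    targets: one 0/1 image-level label per class
    """
    assert len(logits) == len(targets)
    total = 0.0
    for x, y in zip(logits, targets):
        # Numerically stable log(sigmoid(x)).
        if x >= 0:
            log_sig = -math.log1p(math.exp(-x))
        else:
            log_sig = x - math.log1p(math.exp(x))
        # log(1 - sigmoid(x)) = log(sigmoid(x)) - x
        log_one_minus_sig = log_sig - x
        total += -(y * log_sig + (1 - y) * log_one_minus_sig)
    return total / len(logits)

# Illustrative example: an image labelled with classes 0 and 2 out of 4.
example_logits = [2.0, -1.5, 0.5, -3.0]
example_targets = [1, 0, 1, 0]
print(multilabel_soft_margin_loss(example_logits, example_targets))
```

Because each class is scored independently, an image may carry several positive labels at once, which is what image-level supervision on a dataset such as PASCAL VOC requires.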

Sierkinhane avatar Feb 21 '23 02:02 Sierkinhane

I see, so if I want to use CLIMS on my custom dataset, I will need to re-train the res50_cam.pth model with a multi-label classification loss, since the provided checkpoint is trained only on the PASCAL VOC 2012, right?

Spritea avatar Feb 21 '23 03:02 Spritea

Exactly. Better with pre-training.

Sierkinhane avatar Feb 21 '23 04:02 Sierkinhane

Thanks for the explanation! I will try this.

Spritea avatar Feb 21 '23 04:02 Spritea

Hi, I trained the res50_cam.pth on the PASCAL VOC dataset and used it to train CLIMS on PASCAL VOC. The training process went well. However, CLIMS trained with my own res50_cam.pth performs worse than CLIMS trained with the res50_cam.pth you provided: 55.38% vs. 58.27% mIoU.

My training command for training the res50_cam.pth is:

CUDA_VISIBLE_DEVICES=0 python run_sample.py --voc12_root data/VOC2012/ --hyper 10,24,1,0.2 --cam_eval_thres 0.15 --work_space clims_voc12_for_res50_cam --cam_network net.resnet50_clims --train_cam_pass True --train_clims_pass False --make_clims_pass False --eval_cam_pass False

The log for training res50_cam.pth is train_res50_cam_log.txt. The log for training CLIMS with my res50_cam.pth is train_clims_my_res50_cam.txt. The log for training CLIMS with the provided res50_cam.pth is train_clims_provided_res50_cam.txt. Could you check? Thanks!

Spritea avatar Feb 22 '23 16:02 Spritea

What is the performance of your 'res50_cam.pth' on the train set?

Sierkinhane avatar Feb 23 '23 00:02 Sierkinhane

I copied the mIoU directly from the step.eval_cam part of train_clims_my_res50_cam.txt and train_clims_provided_res50_cam.txt. I'm not sure whether that mIoU is from the training set or the val set.

For your reference, I have shared my res50_cam.pth here. Hope this is helpful for the question.

Spritea avatar Feb 23 '23 01:02 Spritea

You can evaluate your res50_cam.pth on the 'train' set to get the mIoU. The scripts are similar to those used to evaluate CLIMS.
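As background, mIoU is usually derived from a pixel-level confusion matrix. The sketch below shows that standard computation in plain Python; it is not taken from the repository's evaluation script, and the function names and the toy matrix are assumptions made for illustration.

```python
def per_class_iou(confusion):
    """confusion[g][p] = number of pixels with ground-truth class g predicted as p."""
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # class c pixels predicted as something else
        fp = sum(confusion[g][c] for g in range(n)) - tp  # other pixels predicted as class c
        denom = tp + fp + fn
        # A class that never appears and is never predicted has no defined IoU.
        ious.append(tp / denom if denom else float("nan"))
    return ious

def mean_iou(confusion):
    """Average the per-class IoU values, skipping undefined (NaN) classes."""
    ious = [v for v in per_class_iou(confusion) if v == v]
    return sum(ious) / len(ious)

# Toy two-class example (hypothetical pixel counts).
toy_confusion = [[8, 2],
                 [1, 9]]
print(per_class_iou(toy_confusion), mean_iou(toy_confusion))
```

Whether the background class is included in the average depends on the evaluation script, so mIoU values are only comparable when they come from the same script and the same image list.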

Sierkinhane avatar Feb 23 '23 02:02 Sierkinhane

For evaluation, I guess you mean the step/eval_cam.py file. This file requires the CAM result as input, which I guess is generated by the step/make_cam.py file. So I should first use step/make_cam.py to generate the CAM result with my res50_cam.pth (without CLIMS), then use step/eval_cam.py to evaluate it on the train set, is that right?

Spritea avatar Feb 23 '23 02:02 Spritea

Exactly.

Sierkinhane avatar Feb 23 '23 02:02 Sierkinhane

Hi, I just evaluated my res50_cam.pth (without CLIMS) on the train set, and the mIoU is 47.89%. Could you have a look?

My command for generating CAM using my res50_cam.pth:

CUDA_VISIBLE_DEVICES=0 python run_sample.py --voc12_root data/VOC2012/ --hyper 10,24,1,0.2 --work_space clims_voc12_for_res50_cam_eval --cam_network net.resnet50_cam --make_cam_pass True

My command for evaluating the CAM of my res50_cam.pth on train set:

CUDA_VISIBLE_DEVICES=0 python run_sample.py --voc12_root data/VOC2012/ --hyper 10,24,1,0.2 --work_space clims_voc12_for_res50_cam_eval --eval_cam_pass True
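The two steps above (generate CAMs, then evaluate them) can be chained from a small Python wrapper. This is only a sketch: it reuses the exact flags from the two commands quoted in this comment, while the helper names (build_command, cam_eval_commands, run_all) are hypothetical and not part of the repository.

```python
import os
import subprocess

# Arguments shared by both commands, copied from the commands quoted above.
BASE = ["python", "run_sample.py",
        "--voc12_root", "data/VOC2012/",
        "--hyper", "10,24,1,0.2"]

def build_command(work_space, extra):
    """Return one run_sample.py invocation as an argument list."""
    return BASE + ["--work_space", work_space] + list(extra)

def cam_eval_commands(work_space):
    """Step 1 generates CAMs; step 2 evaluates them in the same work space."""
    make_cam = build_command(
        work_space, ["--cam_network", "net.resnet50_cam", "--make_cam_pass", "True"])
    eval_cam = build_command(work_space, ["--eval_cam_pass", "True"])
    return [make_cam, eval_cam]

def run_all(commands, gpu="0"):
    """Run the commands in order on one GPU, stopping at the first failure."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    for cmd in commands:
        subprocess.run(cmd, check=True, env=env)

# Print the commands without executing them; call run_all(...) to actually run.
for cmd in cam_eval_commands("clims_voc12_for_res50_cam_eval"):
    print(" ".join(cmd))
```

Keeping both steps in the same work space matters because the evaluation step reads the CAM output that the generation step wrote there.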

Spritea avatar Feb 24 '23 01:02 Spritea

Could you evaluate my 'res50_cam.pth' on the train set? The mIoU of the initial model may help.

Sierkinhane avatar Feb 24 '23 04:02 Sierkinhane

Hi, I have evaluated the res50_cam.pth (without CLIMS) model provided by you on the train set, and the mIoU is 48.11%.

The evaluation log for my res50_cam.pth (without CLIMS):

{'num_workers': 12, 'voc12_root': 'data/VOC2012/', 'train_list': 'voc12/train_aug.txt', 'val_list': 'voc12/val.txt', 'infer_list': 'voc12/train_aug.txt', 'chainer_eval_set': 'train', 'cam_network': 'net.resnet50_cam', 'feature_dim': 2048, 'cam_crop_size': 512, 'cam_batch_size': 16, 'cam_num_epoches': 5, 'cam_learning_rate': 0.1, 'cam_weight_decay': 0.0001, 'cam_eval_thres': 0.15, 'cam_scales': (1.0, 0.5, 1.5, 2.0), 'num_cores_eval': 8, 'clims_network': 'net.resnet50_clims', 'clims_num_epoches': 15, 'clims_learning_rate': 0.00025, 'hyper': '10,24,1,0.2', 'clip': 'ViT-B/32', 'conf_fg_thres': 0.3, 'conf_bg_thres': 0.1, 'irn_network': 'net.resnet50_irn', 'irn_crop_size': 512, 'irn_batch_size': 32, 'irn_num_epoches': 3, 'irn_learning_rate': 0.1, 'irn_weight_decay': 0.0001, 'beta': 10, 'exp_times': 8, 'sem_seg_bg_thres': 0.2, 'work_space': 'clims_voc12_for_res50_cam_eval', 'log_name': 'clims_voc12_for_res50_cam_eval/sample_train_eval', 'cam_weights_name': 'clims_voc12_for_res50_cam_eval/res50_cam.pth', 'irn_weights_name': 'clims_voc12_for_res50_cam_eval/res50_irn.pth', 'cam_out_dir': 'clims_voc12_for_res50_cam_eval/cam_mask', 'ir_label_out_dir': 'clims_voc12_for_res50_cam_eval/ir_label', 'sem_seg_out_dir': 'clims_voc12_for_res50_cam_eval/sem_seg', 'ins_seg_out_dir': 'clims_voc12_for_res50_cam_eval/ins_seg', 'clims_weights_name': 'clims_voc12_for_res50_cam_eval/res50_clims', 'train_cam_pass': False, 'train_clims_pass': False, 'make_cam_pass': False, 'make_clims_pass': False, 'eval_cam_pass': True, 'cam_to_ir_label_pass': False, 'train_irn_pass': False, 'make_ins_seg_pass': False, 'eval_ins_seg_pass': False, 'make_sem_seg_pass': False, 'eval_sem_seg_pass': False}
step.eval_cam: Thu Feb 23 20:45:37 2023
{'aeroplane': 0.3878330494895696, 'bicycle': 0.277048223235467, 'bird': 0.4008208719410258, 'boat': 0.302372916148759, 'bottle': 0.46038248257484954, 'bus': 0.6441010496947992, 'car': 0.52437011459361, 'cat': 0.5810356518108325, 'chair': 0.2503940042009332, 'cow': 0.5563700992609669, 'dining table': 0.41537679467596306, 'dog': 0.5097696192448804, 'horse': 0.52340754056787, 'motorbike': 0.6070907393004206, 'player': 0.48949676891373317, 'potted plant': 0.4091623130159273, 'sheep': 0.5904093659220163, 'sofa': 0.472986569753616, 'train': 0.4686553897256759, 'tv monitor': 0.41682748577263534}
threshold: 0.15 miou: 0.4789238743955983 i_imgs 1464
among_predfg_bg 0.36928326345424084

The evaluation log for the res50_cam.pth (without CLIMS) model provided by you:

{'num_workers': 12, 'voc12_root': 'data/VOC2012/', 'train_list': 'voc12/train_aug.txt', 'val_list': 'voc12/val.txt', 'infer_list': 'voc12/train_aug.txt', 'chainer_eval_set': 'train', 'cam_network': 'net.resnet50_cam', 'feature_dim': 2048, 'cam_crop_size': 512, 'cam_batch_size': 16, 'cam_num_epoches': 5, 'cam_learning_rate': 0.1, 'cam_weight_decay': 0.0001, 'cam_eval_thres': 0.15, 'cam_scales': (1.0, 0.5, 1.5, 2.0), 'num_cores_eval': 8, 'clims_network': 'net.resnet50_clims', 'clims_num_epoches': 15, 'clims_learning_rate': 0.00025, 'hyper': '10,24,1,0.2', 'clip': 'ViT-B/32', 'conf_fg_thres': 0.3, 'conf_bg_thres': 0.1, 'irn_network': 'net.resnet50_irn', 'irn_crop_size': 512, 'irn_batch_size': 32, 'irn_num_epoches': 3, 'irn_learning_rate': 0.1, 'irn_weight_decay': 0.0001, 'beta': 10, 'exp_times': 8, 'sem_seg_bg_thres': 0.2, 'work_space': 'clims_voc12_for_res50_cam_eval_ori_model', 'log_name': 'clims_voc12_for_res50_cam_eval_ori_model/sample_train_eval', 'cam_weights_name': 'clims_voc12_for_res50_cam_eval_ori_model/res50_cam.pth', 'irn_weights_name': 'clims_voc12_for_res50_cam_eval_ori_model/res50_irn.pth', 'cam_out_dir': 'clims_voc12_for_res50_cam_eval_ori_model/cam_mask', 'ir_label_out_dir': 'clims_voc12_for_res50_cam_eval_ori_model/ir_label', 'sem_seg_out_dir': 'clims_voc12_for_res50_cam_eval_ori_model/sem_seg', 'ins_seg_out_dir': 'clims_voc12_for_res50_cam_eval_ori_model/ins_seg', 'clims_weights_name': 'clims_voc12_for_res50_cam_eval_ori_model/res50_clims', 'train_cam_pass': False, 'train_clims_pass': False, 'make_cam_pass': False, 'make_clims_pass': False, 'eval_cam_pass': True, 'cam_to_ir_label_pass': False, 'train_irn_pass': False, 'make_ins_seg_pass': False, 'eval_ins_seg_pass': False, 'make_sem_seg_pass': False, 'eval_sem_seg_pass': False}
step.eval_cam: Fri Feb 24 00:39:27 2023
{'aeroplane': 0.43776330336844194, 'bicycle': 0.2898690744920993, 'bird': 0.42104697295496535, 'boat': 0.35934724517947014, 'bottle': 0.4474912544130493, 'bus': 0.6073325975418234, 'car': 0.5224978877557286, 'cat': 0.4224395073545029, 'chair': 0.27233237467560495, 'cow': 0.5737923737175804, 'dining table': 0.38645536061061486, 'dog': 0.4528349697951786, 'horse': 0.5047025024011766, 'motorbike': 0.6040080714806751, 'player': 0.5333809823476624, 'potted plant': 0.44080811162720984, 'sheep': 0.622085347911287, 'sofa': 0.44851732806495437, 'train': 0.5151142389572412, 'tv monitor': 0.44854477412047417}
threshold: 0.15 miou: 0.4811906300786135 i_imgs 1464
among_predfg_bg 0.3087233065778073
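To see where the two checkpoints differ, the per-class IoU dictionaries in the two logs can be compared directly. The sketch below does this for four classes whose values are copied from the logs above; the function name iou_gaps is hypothetical, and the remaining classes can be pasted into the dictionaries in the same way.

```python
def iou_gaps(own, provided):
    """Return (class, own - provided) pairs, largest drop first."""
    gaps = [(name, own[name] - provided[name]) for name in own if name in provided]
    return sorted(gaps, key=lambda item: item[1])

# Subset of the per-class IoU values from the two evaluation logs.
own_iou = {
    "aeroplane": 0.3878330494895696,
    "boat": 0.302372916148759,
    "cat": 0.5810356518108325,
    "dog": 0.5097696192448804,
}
provided_iou = {
    "aeroplane": 0.43776330336844194,
    "boat": 0.35934724517947014,
    "cat": 0.4224395073545029,
    "dog": 0.4528349697951786,
}

for name, gap in iou_gaps(own_iou, provided_iou):
    print(f"{name:10s} {gap:+.4f}")
```

The reported miou may also include a background class that is not listed in the per-class dictionary, so the mean of the listed values need not match it exactly.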

Spritea avatar Feb 24 '23 05:02 Spritea

Hi, sorry for the late reply. The mIoU of the reproduced baseline CAM is lower than that of the checkpoint in this repository, e.g., for aeroplane and boat. I'm not sure whether this is the reason for the lower performance of the CLIMS you reproduced. I will try to re-train the baseline CAM and then train CLIMS.

Sierkinhane avatar Mar 12 '23 15:03 Sierkinhane