The code for localization

Open jianpursuit opened this issue 6 years ago • 13 comments

Is there code for finding the bounding boxes and measuring localization accuracy?

jianpursuit avatar Sep 08 '18 01:09 jianpursuit

I use the code from CAM (https://github.com/metalbubble/CAM) for producing boxes and measuring localization accuracy.
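
The core of that extraction is just thresholding the heatmap and taking the bounding box of the largest connected region. A minimal sketch of the idea (my own simplification, assuming OpenCV >= 4 and a heatmap normalized to [0, 1]; not the exact CAM code):

```python
import cv2
import numpy as np

def cam_to_bbox(cam, threshold=0.2):
    """cam: 2D float array in [0, 1], already resized to the image size."""
    # Binarize the heatmap at a fraction of its peak activation.
    mask = np.uint8(cam >= threshold * cam.max()) * 255
    # Keep the largest connected region and return its bounding box.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    x, y, w, h = cv2.boundingRect(largest)
    return x, y, x + w, y + h  # (x1, y1, x2, y2)
```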

xiaomengyc avatar Sep 08 '18 03:09 xiaomengyc

I get it, thanks a lot.

jianpursuit avatar Sep 08 '18 07:09 jianpursuit

Hi @xiaomengyc ,

Thanks for sharing your code! When I run it, I get a similar classification error on the CUB dataset. However, the localization error I get is much higher. Looking into the CAM code, I found three threshold values, and changing them gives me better results. Did you use the provided thresholds [20, 100, 110], or different values? Thanks.

chaoyan1037 avatar Oct 22 '18 01:10 chaoyan1037

I tested different thresholds to obtain the best localization results. I am sorry, but I do not remember the exact threshold values.

xiaomengyc avatar Oct 22 '18 06:10 xiaomengyc

@xiaomengyc Thanks for your kind reply! It is fine; it is great to know how you set the threshold values. Thanks again!

chaoyan1037 avatar Oct 22 '18 14:10 chaoyan1037

@xiaomengyc One more question: I have run your code and tried different thresholds, but I cannot reproduce the reported localization results, even though I did get a similar classification result, so I think the model is well trained. What really matters here is the threshold values. Would it be possible for you to find the threshold values in your code? I would really appreciate it. Thanks a lot.

chaoyan1037 avatar Oct 25 '18 16:10 chaoyan1037

Actually, when I evaluate with ground-truth labels, I get localization results similar to those you reported. However, the classification error rate is about 25%, which means the top-1 localization error will be much worse than you reported.
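
To spell out why: under the standard protocol, top-1 localization counts a hit only when the predicted class is correct and the box has IoU >= 0.5 with the ground truth, while GT-known localization drops the class condition. A sketch of that accounting (my own summary, not code from this repo):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

def loc_errors(samples):
    """samples: iterable of (pred_class, gt_class, pred_box, gt_box) tuples."""
    n = gt_known_hits = top1_hits = 0
    for pred_class, gt_class, pred_box, gt_box in samples:
        n += 1
        localized = iou(pred_box, gt_box) >= 0.5   # GT-known only needs this
        gt_known_hits += localized
        top1_hits += localized and (pred_class == gt_class)  # top-1 needs both
    return 1 - gt_known_hits / n, 1 - top1_hits / n  # (GT-known, top-1) errors
```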

chaoyan1037 avatar Oct 25 '18 17:10 chaoyan1037

Hi @chaoyan1037 , Sorry about the missing threshold values. I looked through my experiment records but still could not find them. It seems I only kept the best results, since I used a script to run over many possible combinations. The localization results using ground-truth labels are fairer for comparison, so a small difference makes sense. Also, the results reported in the paper were obtained with TensorFlow, which may cause some difference as well.
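
The sweep was along these lines (a hypothetical reconstruction; `evaluate` stands in for a full CAM evaluation run at a given threshold triple):

```python
from itertools import product

def sweep_thresholds(evaluate, candidates=range(20, 200, 10)):
    """evaluate: maps a (t1, t2, t3) threshold triple to a top-1 loc error."""
    best_err, best_thr = float("inf"), None
    for thr in product(candidates, repeat=3):
        err = evaluate(thr)           # run the full evaluation once per triple
        if err < best_err:
            best_err, best_thr = err, thr
    return best_thr, best_err
```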

xiaomengyc avatar Oct 30 '18 11:10 xiaomengyc

Hi @xiaomengyc Thanks for the reply! Your code has helped me understand this task. Thanks a lot for sharing it!

chaoyan1037 avatar Oct 30 '18 14:10 chaoyan1037

Hi @xiaomengyc , One last question, and I hope it is not annoying. Since I am new to this topic, I am not sure my evaluation method is right. First, CAM can generate multiple bounding boxes for each image. Which one do you use for evaluation? I just select the first one. Second, I use the ILSVRC 2012 dataset while you use ILSVRC 2016. Does that make any difference? Thanks for your patience and help!

chaoyan1037 avatar Nov 01 '18 15:11 chaoyan1037

Hi @jianpursuit , @xiaomengyc , @chaoyan1037 ,

I am currently looking for a way to create bounding boxes on images whose training data does not contain any annotations. Is it possible to use this code for that? I am trying to create training data for YOLO.

(PS: I am new to this field, so please excuse me if my questions are too vague.)

Thanks, Rahul

Rahul-Venugopal avatar Dec 04 '18 13:12 Rahul-Venugopal

@Rahul-Venugopal
ACoL is designed exactly for the weakly supervised localization problem. It learns a classification network with only image-level labels (like cat or dog) as supervision, and we apply that network to generate heatmaps indicating where the target objects most likely appear. After getting the heatmaps with our code, you can produce the bounding boxes using the code from CAM.
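
The heatmap step boils down to weighting the last convolutional feature maps by the classifier weights of the target class. A generic CAM-style sketch in my own notation (not the actual ACoL code, which uses convolutional classifier branches):

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """features: (C, H, W) conv activations; fc_weights: (num_classes, C)."""
    c, h, w = features.shape
    # Weight each feature channel by its contribution to the class logit.
    cam = fc_weights[class_idx] @ features.reshape(c, h * w)
    cam = cam.reshape(h, w)
    cam -= cam.min()
    cam /= cam.max() + 1e-8   # normalize to [0, 1] before thresholding
    return cam
```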

Meanwhile, you can also try our SPG model, which provides a trained model on ImageNet. You can start with that model to see whether this weakly supervised method fits your needs.

Good luck! Xiaolin

xiaomengyc avatar Dec 04 '18 23:12 xiaomengyc