Faster-RCNN_TF

Low and weird mAP results on new dataset

Open hadign20 opened this issue 7 years ago • 16 comments

I have trained Faster R-CNN on a new dataset based on the pascal_VOC format. The configuration is the same as for pascal_VOC, except there are 9 classes, and training was run for 10,000 iterations. These are the results:

AP for car = 0.5931
AP for truck = 0.3333
AP for tractor = 0.3470
AP for campingcar = 0.4091
AP for van = 0.5000
AP for other = 0.1983
AP for pickup = 0.3439
AP for boat = 0.0769
AP for plane = -1.0000
Mean AP = 0.2002

As you can see, the AP values are very low, and for objects of type plane it is -1. What can cause this? Should I change anything else to improve the AP values?

hadign20 avatar Jun 04 '17 13:06 hadign20

The -1 may come from the absence of any plane in your test set. This case actually hurts your mean AP more than it should, since the -1 is taken into the average; your mean AP is about 0.31 if you count 0 for planes. Maybe you don't have enough data, did not train for long enough, or should try changing some hyperparameters (e.g. the learning rate).
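For concreteness, here is a minimal sketch (plain Python, illustrative only, not code from this repo) of how that -1 placeholder drags the mean down compared with counting the absent class as 0 or excluding it:

```python
# Illustrative only (not repo code): effect of the -1 placeholder for an
# absent class on the reported mean AP.
aps = {
    'car': 0.5931, 'truck': 0.3333, 'tractor': 0.3470, 'campingcar': 0.4091,
    'van': 0.5000, 'other': 0.1983, 'pickup': 0.3439, 'boat': 0.0769,
    'plane': -1.0000,   # no plane instances in the test set
}

# Mean as reported above: the -1 is averaged in like a real score.
mean_as_reported = sum(aps.values()) / len(aps)                    # ~0.2002

# Counting the absent class as 0 instead of -1.
mean_zeroed = sum(max(v, 0.0) for v in aps.values()) / len(aps)    # ~0.31

# Excluding classes that never appear in the test set.
present = [v for v in aps.values() if v >= 0]
mean_excluded = sum(present) / len(present)                        # ~0.35

print(mean_as_reported, mean_zeroed, mean_excluded)
```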

gdelab avatar Jun 14 '17 13:06 gdelab

@hadi-ghnd Hello, can you tell me where the results are? Thanks.

shiwenhao avatar Jun 18 '17 07:06 shiwenhao

@shiwenhao After training is done, you can see the results in the terminal. You can also make a new folder called logs inside the experiments folder, so the training logs and the results are stored there.

hadign20 avatar Jun 18 '17 18:06 hadign20

@hadi-ghnd In the terminal I do not see the results; the only output I get is the following:

Wrote snapshot to: /home/qz/Faster-RCNN_TF/output/faster_rcnn_end2end/voc_2007_trainval/VGGnet_fast_rcnn_iter_70000.ckpt
done solving

real 222m23.290s
user 766m26.504s
sys 66m26.240s

+ set +x
+ python ./tools/test_net.py --device gpu --device_id 2 --weights /home/qz/Faster-RCNN_TF/output/faster_rcnn_end2end/voc_2007_trainval/VGGnet_fast_rcnn_iter_70000.ckpt --imdb voc_2007_test --cfg experiments/cfgs/faster_rcnn_end2end.yml --network VGGnet_test
voc_2007_train voc_2007_val voc_2007_trainval voc_2007_test kitti_train kitti_val kitti_trainval kitti_test nthu_71 nthu_370
Called with args:
Namespace(cfg_file='experiments/cfgs/faster_rcnn_end2end.yml', comp_mode=False, device='gpu', device_id=2, imdb_name='voc_2007_test', model='/home/qz/Faster-RCNN_TF/output/faster_rcnn_end2end/voc_2007_trainval/VGGnet_fast_rcnn_iter_70000.ckpt', network_name='VGGnet_test', prototxt=None, wait=True)
Using config:
{'DATA_DIR': '/home/qz/Faster-RCNN_TF/data', 'DEDUP_BOXES': 0.0625, 'EPS': 1e-14, 'EXP_DIR': 'faster_rcnn_end2end', 'GPU_ID': 0, 'IS_MULTISCALE': False, 'MATLAB': 'matlab', 'MODELS_DIR': '/home/qz/Faster-RCNN_TF/models/pascal_voc', 'PIXEL_MEANS': array([[[ 102.9801, 115.9465, 122.7717]]]), 'RNG_SEED': 3, 'ROOT_DIR': '/home/qz/Faster-RCNN_TF',
 'TEST': {'BBOX_REG': True, 'DEBUG_TIMELINE': False, 'HAS_RPN': True, 'MAX_SIZE': 1920, 'NMS': 0.3, 'PROPOSAL_METHOD': 'selective_search', 'RPN_MIN_SIZE': 16, 'RPN_NMS_THRESH': 0.7, 'RPN_POST_NMS_TOP_N': 1200, 'RPN_PRE_NMS_TOP_N': 6000, 'SCALES': [900], 'SVM': False},
 'TRAIN': {'ASPECT_GROUPING': True, 'BATCH_SIZE': 128, 'BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0], 'BBOX_NORMALIZE_MEANS': [0.0, 0.0, 0.0, 0.0], 'BBOX_NORMALIZE_STDS': [0.1, 0.1, 0.2, 0.2], 'BBOX_NORMALIZE_TARGETS': True, 'BBOX_NORMALIZE_TARGETS_PRECOMPUTED': True, 'BBOX_REG': True, 'BBOX_THRESH': 0.5, 'BG_THRESH_HI': 0.5, 'BG_THRESH_LO': 0.0, 'DEBUG_TIMELINE': False, 'DISPLAY': 10, 'FG_FRACTION': 0.25, 'FG_THRESH': 0.5, 'GAMMA': 0.1, 'HAS_RPN': True, 'IMS_PER_BATCH': 1, 'LEARNING_RATE': 0.001, 'MAX_SIZE': 1000, 'MOMENTUM': 0.9, 'PROPOSAL_METHOD': 'gt', 'RPN_BATCHSIZE': 256, 'RPN_BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0], 'RPN_CLOBBER_POSITIVES': False, 'RPN_FG_FRACTION': 0.5, 'RPN_MIN_SIZE': 16, 'RPN_NEGATIVE_OVERLAP': 0.3, 'RPN_NMS_THRESH': 0.7, 'RPN_POSITIVE_OVERLAP': 0.7, 'RPN_POSITIVE_WEIGHT': -1.0, 'RPN_POST_NMS_TOP_N': 2000, 'RPN_PRE_NMS_TOP_N': 12000, 'SCALES': [600], 'SNAPSHOT_INFIX': '', 'SNAPSHOT_ITERS': 5000, 'SNAPSHOT_PREFIX': 'VGGnet_fast_rcnn', 'STEPSIZE': 50000, 'USE_FLIPPED': True, 'USE_PREFETCH': False},
 'USE_GPU_NMS': True}
Waiting for /home/qz/Faster-RCNN_TF/output/faster_rcnn_end2end/voc_2007_trainval/VGGnet_fast_rcnn_iter_70000.ckpt to exist...

Besides, in the logs folder there is a file named "faster_rcnn_end2end_VGG16_.txt.2017-06-19_09-43-13", but its content is the same as the terminal output. What do I need to do?

shiwenhao avatar Jun 19 '17 06:06 shiwenhao

@shiwenhao your testing is not happening because you can't load the checkpoint: see #161, with the solution in #79.

gdelab avatar Jun 19 '17 07:06 gdelab

@shiwenhao If it gets stuck at Waiting for /home/qz/Faster-RCNN_TF/output/faster_rcnn_end2end/voc_2007_trainval/VGGnet_fast_rcnn_iter_70000.ckpt to exist..., try changing two files based on this and this page. If that doesn't help, try the other solutions that @gdelab suggested.
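For reference, my guess at what usually causes the hang (an assumption on my part, not necessarily what those pages describe): TensorFlow 0.12 and later write a checkpoint as three files (.ckpt.index, .ckpt.data-* and .ckpt.meta) and never create a literal file named VGGnet_fast_rcnn_iter_70000.ckpt, so a wait loop that only polls os.path.exists on that exact path spins forever. A minimal sketch of a more tolerant wait, assuming the loop lives in tools/test_net.py:

```python
# Hedged sketch, not verbatim repo code: accept both old-style single-file
# checkpoints and new-style TensorFlow checkpoints, which only produce
# <prefix>.index / <prefix>.data-* / <prefix>.meta on disk.
import os
import time

def wait_for_checkpoint(ckpt_prefix, poll_seconds=10):
    """Block until a checkpoint with the given prefix exists on disk."""
    while not (os.path.exists(ckpt_prefix) or
               os.path.exists(ckpt_prefix + '.index')):
        print('Waiting for {} to exist...'.format(ckpt_prefix))
        time.sleep(poll_seconds)

# Usage idea inside tools/test_net.py (args.model is the --weights path):
#   wait_for_checkpoint(args.model)
#   saver.restore(sess, args.model)  # restore accepts the prefix for new-style checkpoints
```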

hadign20 avatar Jun 19 '17 07:06 hadign20

@hadi-ghnd @gdelab After making that change I have a new problem: cudaCheckError() failed : invalid resource handle. I changed sm_37 to sm_60 and sm_61 in 'lib/make.sh', but it did not work. My GPU is a GTX 1080 Ti.

shiwenhao avatar Jun 19 '17 12:06 shiwenhao

I don't know how to solve that. You should probably close this issue and start a new one for your new problem; I don't think they're related...

gdelab avatar Jun 19 '17 14:06 gdelab

@shiwenhao I think you should change the architecture flag in lib/setup.py as well and then run make again (a rough mapping is sketched below). Also check your CUDA and cuDNN versions to see if they are high enough. If this doesn't help, I think you should create a new issue as @gdelab says.
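A rough helper sketch (my own notes, not code from the repo, and the GPU names are just examples): the '-arch=sm_XX' value that lib/make.sh and lib/setup.py are compiled with has to match the compute capability of your card before you rebuild lib/:

```python
# Rough sketch (not repo code): which nvcc '-arch' value to substitute for the
# default '-arch=sm_37' in lib/make.sh and lib/setup.py before rebuilding lib/.
ARCH_BY_GPU = {
    'Tesla K80':        'sm_37',   # the repo's default target
    'GTX 970 / 980':    'sm_52',
    'GTX 1070 / 1080':  'sm_61',
    'GTX 1080 Ti':      'sm_61',
    'Titan X (Pascal)': 'sm_61',
}

def nvcc_arch_flag(gpu_name):
    """Return the nvcc flag to compile the CUDA extensions in lib/ with."""
    return '-arch={}'.format(ARCH_BY_GPU[gpu_name])

print(nvcc_arch_flag('GTX 1080 Ti'))   # -> -arch=sm_61
```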

hadign20 avatar Jun 19 '17 16:06 hadign20

@hadi-ghnd I solved it by changing the GPU id from 1 to 0... unbelievable. I had tried GPU ids 1, 2, and 3, but not 0, and I do not know why this matters. Thank you very much for your help.

shiwenhao avatar Jun 20 '17 02:06 shiwenhao

@shiwenhao Me too: I changed the GPU ID from 2 to 0 and now it works. It's really unbelievable. Do you know why now?

luwanxuan avatar Nov 14 '17 13:11 luwanxuan

@luwanxuan I don't know why... I see that memory on all four GPUs is occupied, but only one of them is doing the computation. I guess it's a characteristic of TensorFlow.
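If it helps, here is a minimal sketch of what I mean (my own guess, not repo code): by default TensorFlow maps and reserves memory on every GPU it can see, even though only one of them runs the graph, so restricting visibility before creating the session keeps the other cards free:

```python
# Minimal sketch (TF 1.x style, not repo code): limit which GPUs TensorFlow
# sees and how much memory it grabs on them.
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '2'   # expose only physical GPU 2; it becomes /gpu:0 in-process

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True     # allocate memory as needed instead of all at once
config.allow_soft_placement = True         # fall back gracefully if the device is unavailable

with tf.Session(config=config) as sess:
    with tf.device('/gpu:0'):              # the single visible card
        x = tf.constant([1.0, 2.0, 3.0])
        print(sess.run(tf.reduce_sum(x)))
```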

shiwenhao avatar Nov 15 '17 06:11 shiwenhao

@hadi-ghnd Do you know why AP = -1? I have the same problem.

hxf930620 avatar May 03 '18 10:05 hxf930620

@shiwenhao In which file should we change this setting? I trained for 5 epochs and the mAP for every class was 0; then I trained the network for another 5 epochs, and now it gives -1 mAP for all classes. Would you know what is going wrong? I am training on a custom dataset with 4 classes (3 + 1 for background), and I made all the necessary changes according to the VOC dataset.

I would really appreciate some help. Thank you!

lipikak17 avatar Jul 10 '18 10:07 lipikak17

@hadi-ghnd Could you please explain how to debug this problem of -1 mAP?

lipikak17 avatar Jul 17 '18 07:07 lipikak17

@luwanxuan Could you please share in which file to change the GPU id? Thank you in advance! :)

lipikak17 avatar Aug 13 '18 12:08 lipikak17