light_head_rcnn

The predict procedure doesn't support multi-batch

Open machanic opened this issue 6 years ago • 11 comments

The predict procedure doesn't support multi-batch inference. We tested your code on an Nvidia 1080Ti GPU, which achieves only 2 images per second, far lower than the 100 FPS reported in the original paper.

machanic avatar Apr 06 '18 14:04 machanic

Hi. About speed: we tested Light-Head with ResNet-101 on a TITAN Xp, and it yields 40 mAP at 10 FPS. Following common practice, the testing time counts only the network forward pass (excluding post-processing). You may need to modify test.py as follows to see a more reasonable testing time. BTW, TensorFlow is slow at startup, so please check the detection speed after enough testing iterations. My testing log is in the attachment: result.txt

diff --git a/experiments/lizeming/light_head_rcnn.ori_res101.coco.ps_roialign/test.py b/experiments/lizeming/light_head_rcnn.ori_res101.coco.ps_roialign/test.py
index a48e336..7f67a4a 100644
--- a/experiments/lizeming/light_head_rcnn.ori_res101.coco.ps_roialign/test.py
+++ b/experiments/lizeming/light_head_rcnn.ori_res101.coco.ps_roialign/test.py
@@ -12,6 +12,7 @@ from config import cfg, config
 
 import argparse
 import dataset
+import time
 import os.path as osp
 import network_desp
 import tensorflow as tf
@@ -64,10 +65,10 @@ def inference(val_func, inputs, data_dict):
 
     feed_dict = {inputs[0]: resized_img[None, :, :, :], inputs[1]: im_info}
 
-    #st = time.time()
+    st = time.time()
     _, scores, pred_boxes, rois = val_func(feed_dict=feed_dict)
-    #ed = time.time()
-    #print(ed -st)
+    ed = time.time()
+    print(ed -st)
 
     boxes = rois[:, 1:5] / scale

About multi-batch testing: I think it is not very hard to modify test.py to support multi-batch testing. BTW, we reach 100 FPS with a small backbone network (Xception-like, 145M FLOPs), not with ResNet-101 (7.6G FLOPs).

zengarden avatar Apr 07 '18 02:04 zengarden
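The timing advice above (count only the forward pass, and skip the slow start-up iterations) can be sketched as follows. This is a minimal illustration, not code from the repository: `time_forward`, `num_warmup`, `num_timed`, and the stand-in workload are hypothetical names, and in test.py the callable would wrap the `val_func(feed_dict=feed_dict)` call.

```python
import time


def time_forward(run_once, num_warmup=10, num_timed=50):
    """Return the mean seconds per call of run_once, ignoring warm-up calls."""
    # Warm-up calls are executed but not timed (start-up cost is excluded).
    for _ in range(num_warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(num_timed):
        run_once()
    return (time.perf_counter() - start) / num_timed


if __name__ == "__main__":
    # Hypothetical stand-in for the network forward pass.
    def fake_forward():
        sum(i * i for i in range(10000))

    sec = time_forward(fake_forward)
    print("%.6f s/img" % sec)
```

Averaging over many timed calls gives a steadier number than printing `ed - st` for a single image.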

@zengarden I have changed test.py for multi-batch testing, but I ran into a problem: how can I tell which image each output of the net belongs to? The output doesn't show which image a box belongs to. How can I know this in multi-batch testing?

MaskVulcan avatar Apr 07 '18 08:04 MaskVulcan

@MaskVulcan The outputs should be in the same order as the inputs.

zengarden avatar Apr 07 '18 09:04 zengarden

@zengarden For example, I changed test.py and set batch_size to 2, but the net's output cls_score has shape (1001, 81). How can I know the order? Which image does each row belong to?

MaskVulcan avatar Apr 07 '18 09:04 MaskVulcan

@zengarden Oh, I see. Do the rois carry the index information?

MaskVulcan avatar Apr 07 '18 09:04 MaskVulcan

@MaskVulcan For multi-batch testing, first set the attribute test_batch_per_gpu in config.py to the desired batch size (e.g. 2). Then modify test.py to provide a feed_dict with multiple images. Once both are done, scores and rois should have shapes [2000, 81] and [2000, 5]. The first column of rois is the batch_index.

zengarden avatar Apr 07 '18 09:04 zengarden
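Feeding several images in one feed_dict requires them to share one array shape. A minimal sketch of one common way to do that, assuming zero-padding at the bottom and right is acceptable; `pad_images_to_batch` is a hypothetical helper, not a function from this repository, and the matching im_info layout for a batch is not shown because the thread does not specify it.

```python
import numpy as np


def pad_images_to_batch(images):
    """Zero-pad HxWxC images at the bottom/right and stack them into [B, H, W, C]."""
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)
    channels = images[0].shape[2]
    batch = np.zeros((len(images), max_h, max_w, channels), dtype=images[0].dtype)
    for i, img in enumerate(images):
        h, w = img.shape[:2]
        # Each image keeps its top-left corner at the origin.
        batch[i, :h, :w, :] = img
    return batch


if __name__ == "__main__":
    imgs = [np.ones((2, 3, 3), dtype=np.float32),
            np.ones((4, 2, 3), dtype=np.float32)]
    print(pad_images_to_batch(imgs).shape)
```

Keeping the origin fixed means box coordinates predicted on the padded image still refer to the unpadded (resized) image.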

@zengarden I get it. I changed the batch size to 2, but the output does not look right: the shape of rois keeps changing. When I print it, I get (2000, 5), (577, 5), (1844, 5), ... I am a little confused. I copied the code of get_data_for_singlegpu from your dataset.py to feed multi-batch images for testing.

MaskVulcan avatar Apr 07 '18 14:04 MaskVulcan

@MaskVulcan Since the number of boxes varies from image to image, the shape of rois can change. The first column of rois is the batch_index, which lets you pick out the rois of each image.

zengarden avatar Apr 08 '18 04:04 zengarden
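Using the first column of rois as the batch index, the outputs can be grouped per image roughly like this. A minimal sketch, assuming rois has shape [N, 5] laid out as (batch_index, x1, y1, x2, y2) and that scores has one row per roi in the same order; `split_by_batch_index` is a hypothetical helper name.

```python
import numpy as np


def split_by_batch_index(rois, scores, batch_size):
    """Group boxes and scores per image using column 0 of rois as the image index."""
    batch_idx = rois[:, 0].astype(np.int64)
    per_image = []
    for i in range(batch_size):
        keep = batch_idx == i
        # Columns 1:5 are the box coordinates, as in rois[:, 1:5] in test.py.
        per_image.append((rois[keep, 1:5], scores[keep]))
    return per_image


if __name__ == "__main__":
    rois = np.array([[0, 1, 2, 3, 4],
                     [1, 5, 6, 7, 8],
                     [0, 9, 10, 11, 12]], dtype=np.float32)
    scores = np.arange(6, dtype=np.float32).reshape(3, 2)
    for i, (boxes, sc) in enumerate(split_by_batch_index(rois, scores, 2)):
        print(i, boxes.shape, sc.shape)
```

Each image's boxes would then still need to be divided by that image's own resize scale, as the single-image code does with `rois[:, 1:5] / scale`.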

@zengarden I have changed the code. I copied the code in your dataset.py for the training dataset and used it for testing. With batch size 2, the speed becomes 0.06 s/img and the mAP is 26. But with batch size 8, the speed is still 0.06 s/img and the mAP becomes 34. This is very strange. Is there any other code I need to change?

MaskVulcan avatar Apr 09 '18 06:04 MaskVulcan

I ran test.py on a GTX 1070 and got a test speed of 0.18 s/img. Is there any problem?

Mikcal avatar Sep 17 '18 03:09 Mikcal

@MaskVulcan Have you finished multi-batch prediction? With multi-batch, is it faster than single-batch? By how much?

lxyyang avatar Jan 03 '19 03:01 lxyyang