light_head_rcnn
                                
Where is the Xception-like network code described in the original paper?
@zengarden, would you mind providing the code and model of the Xception-like network? I've tried to implement and train it on ImageNet; the best accuracy I get is around 64%. Would you mind sharing the relevant code and your experience with it?
Best,
@foreverYoungGitHub I'm reproducing the efficient Xception-like network in TF. Since the Xception network mentioned in the paper was built and trained on our internal platform, I need some time to port it to TF and verify the accuracy.
Hi, I have also tried the original Xception backbone in my re-implementation and found it works too. The forward time during training for the backbone network is about 20 ms. But it may be better to switch to the Xception-like backbone for better accuracy and speed.
@HiKapok Could you please tell me the speed and mAP at test time?
@MaskVulcan I didn't measure the speed at test time; in fact, the NMS op in TensorFlow is a little slow, which is a bottleneck of the full detection pipeline. You can find the code on my GitHub if you are interested. But I still recommend you wait for the official release to ensure the speed.
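For anyone unfamiliar with the op being discussed, this is TensorFlow's built-in NMS; a minimal usage sketch with toy boxes (not the repo's actual pipeline):

```python
import tensorflow as tf

boxes = tf.constant([[0., 0., 10., 10.],    # (y1, x1, y2, x2)
                     [1., 1., 11., 11.],    # overlaps the first box
                     [50., 50., 60., 60.]])
scores = tf.constant([0.9, 0.8, 0.7])

# Returns the indices of the boxes kept after suppression.
keep = tf.image.non_max_suppression(boxes, scores,
                                    max_output_size=100, iou_threshold=0.5)
kept_boxes = tf.gather(boxes, keep)   # drops the second, overlapping box
```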
@HiKapok OK, thanks! Would you mind telling me your mAP with Xception?
@MaskVulcan 74.5 mAP now on the PASCAL VOC 2007 test set.
Any update on the Xception-like network?
Hi, can someone provide some insights into the Xception-like model? I think I can implement the training and the other parts, but I'm not sure I understood this model well. I have some questions:
- Is the last layer an FC-1000? I don't know which layer should connect to the Light-Head R-CNN.
- What does this expression (from the original paper) mean? "Following xception design strategies, we replace all convolution layer in bottle-neck structure with channel-wise convolution. However we do not use the pre-activation design which is proposed in identity mappings [10] because of shallow network."
- Should I replace the 7 convolutional layers (the first layer and stages 1, 2 and 3) with bottleneck-structure convolution layers using channel-wise convolution?
- Is this the large separable convolution (see the sketch after this list)? Lines 785-792: https://github.com/terrychenism/Deformable-ConvNets/blob/master/rfcn/symbols/resnet_v1_101_rfcn_light.py#L785
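For reference, the construct at those lines is the paper's large separable convolution: two branches of k×1 and 1×k convolutions summed into the thin feature map. The linked code is MXNet; the TensorFlow sketch below is only an illustration, with the values taken from the paper's large setting (k = 15, C_mid = 256, C_out = 10 × 7 × 7 = 490):

```python
import tensorflow as tf

def large_separable_conv(x, k=15, c_mid=256, c_out=490):
    # Branch A: (k x 1) then (1 x k) convolution.
    a = tf.keras.layers.Conv2D(c_mid, (k, 1), padding='same')(x)
    a = tf.keras.layers.Conv2D(c_out, (1, k), padding='same')(a)
    # Branch B: (1 x k) then (k x 1) convolution.
    b = tf.keras.layers.Conv2D(c_mid, (1, k), padding='same')(x)
    b = tf.keras.layers.Conv2D(c_out, (k, 1), padding='same')(b)
    # Summing the two branches yields the thin feature map fed to PSRoI pooling.
    return a + b

# Toy usage: a stride-16 feature map from a 600 x 1000 image.
features = tf.random.normal([1, 38, 63, 1024])
thin_fmap = large_separable_conv(features)   # shape [1, 38, 63, 490]
```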
@HiKapok Good job, but 74.5 mAP on PASCAL VOC is inferior to YOLOv2, considering Light-Head R-CNN should have a better mAP.
Thank you in advance, guys.
@zengarden How is the TF implementation going? Have you finished it? Or could you please just release the structure of the Xception network? I think lots of people are waiting for it. Cheers.
@foreverYoungGitHub Hello. Would you mind telling me your mAP and FPS on MS COCO using the Xception-like backbone in Light-Head R-CNN? I also tried, but I did not reach 30.7 mAP at 700×1100 input size.
How many layers does the Xception-like backbone in Light-Head R-CNN have? Are they 17 conv layers (based on Table 7) and 2 MLPs (from the original code)? Edited: @geonseoks I computed the mAP using test.py (the cocoapi metric) and got this:
evaluation epoch 20
loading annotations into memory...
Done (t=1.20s)
creating index...
index created!
Loading and preparing results...
DONE (t=6.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=75.63s).
Accumulating evaluation results...
DONE (t=14.30s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.040
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.085
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.034
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.002
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.019
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.075
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.067
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.086
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.088
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.004
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.036
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.172
I'm obtaining approximately 20-21 FPS using 2 images at test time; occasionally it drops to 10-12 FPS during testing.
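For anyone wanting to reproduce numbers like these, the cocoapi metric boils down to roughly the following (a sketch; the file names here are placeholders, not the repo's actual paths):

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Placeholder paths: substitute your own annotation and result files.
coco_gt = COCO('annotations/instances_minival2014.json')  # ground truth
coco_dt = coco_gt.loadRes('detections.json')              # detector output

coco_eval = COCOeval(coco_gt, coco_dt, iouType='bbox')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()   # prints the AP/AR table in the format above
```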
@edgarmedina1801 Yes, I use Conv1~FC from Table 7.
And here are my answers.
Is the last layer an FC-1000? Because I don't know what the last layer to connect to Light-Head R-CNN is -> I connect the Stage 3 layer to the RPN's input and the Stage 4 layer to the global context module's input (setting the stride to 1 at Stage 4).
What does this expression (from the original paper) mean? "Following xception design strategies, we replace all convolution layer in bottle-neck structure with channel-wise convolution. However we do not use the pre-activation design which is proposed in identity mappings [10] because of shallow network." -> I change all 3×3 convs in Stages 2~4 to channel-wise convs, and the pre-activation design is the one described in paper [10]. So you can think of the backbone as an Xception-like ResNet, since ResNet has the bottleneck structure. I think this backbone is similar to MobileNet V2.
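Putting that answer into code, here is a minimal sketch of one Stage 2~4 block under that reading (my assumption, not the authors' implementation): a ResNet bottleneck whose 3×3 conv is replaced by a channel-wise (depthwise-separable) conv, with post-activation rather than pre-activation:

```python
import tensorflow as tf

def xception_like_bottleneck(x, c_mid, c_out, stride=1):
    # Project the shortcut when the shape changes, as in ResNet.
    shortcut = x
    if stride != 1 or x.shape[-1] != c_out:
        shortcut = tf.keras.layers.Conv2D(c_out, 1, strides=stride)(x)

    y = tf.keras.layers.Conv2D(c_mid, 1, use_bias=False)(x)   # 1x1 reduce
    y = tf.keras.layers.BatchNormalization()(y)
    y = tf.keras.layers.ReLU()(y)
    # The bottleneck's 3x3 conv, replaced by a channel-wise
    # (depthwise-separable) convolution per the paper's description.
    y = tf.keras.layers.SeparableConv2D(c_mid, 3, strides=stride,
                                        padding='same', use_bias=False)(y)
    y = tf.keras.layers.BatchNormalization()(y)
    y = tf.keras.layers.ReLU()(y)
    y = tf.keras.layers.Conv2D(c_out, 1, use_bias=False)(y)   # 1x1 expand
    y = tf.keras.layers.BatchNormalization()(y)
    # Post-activation: ReLU after the residual add, i.e. no pre-activation.
    return tf.keras.layers.ReLU()(y + shortcut)

# Toy usage.
x = tf.random.normal([1, 56, 56, 144])
y = xception_like_bottleneck(x, c_mid=144, c_out=288, stride=2)
```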
@geonseoks Thanks a lot for elaborating on the Xception network. I have another question, if you don't mind. When the author says channel-wise convolution, is it tf.nn.depthwise_conv2d or tf.nn.separable_conv2d in TensorFlow?
@karansomaiah Good question. I use separable_conv2d; I will try depthwise_conv2d later.
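For reference, the difference between the two ops in a minimal sketch: depthwise_conv2d filters each channel independently, while separable_conv2d adds a 1×1 pointwise step that mixes channels (which is what Xception's channel-wise convolution usually means):

```python
import tensorflow as tf

x = tf.random.normal([1, 56, 56, 64])                     # NHWC input

# depthwise_conv2d: one 3x3 filter per input channel, no channel mixing.
# Filter shape: [height, width, in_channels, channel_multiplier].
dw_filter = tf.random.normal([3, 3, 64, 1])
y_dw = tf.nn.depthwise_conv2d(x, dw_filter,
                              strides=[1, 1, 1, 1],
                              padding='SAME')             # -> [1, 56, 56, 64]

# separable_conv2d: the same depthwise step, followed by a 1x1
# pointwise conv that mixes channels into out_channels.
pw_filter = tf.random.normal([1, 1, 64, 128])
y_sep = tf.nn.separable_conv2d(x, dw_filter, pw_filter,
                               strides=[1, 1, 1, 1],
                               padding='SAME')            # -> [1, 56, 56, 128]
```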
Hey @geonseoks, I tried implementing it with the information you provided, but I get a lot of NaN values in the loss very early in training. I even checked your code:
- I still face the same issue.
- In line 157 where proposal_opr is applied, I cannot use is_tfnms=False since it gives me an error saying lib_kernel.lib_fast_nms was not found. I checked further, and it seems @zengarden has removed it in the latest commit. Any help will be appreciated.
Thanks in advance.
Do you guys change the spatial_scale argument for the PSAlign code in your network definition for the Xception backbone? @geonseoks @edgarmedina1801 @HiKapok
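For what it's worth, a sketch of the usual reasoning (an assumption, not a confirmed answer): spatial_scale only maps image coordinates onto feature-map coordinates, so if the Xception-like backbone keeps the same overall stride of 16 before PSAlign (with Stage 4 run at stride 1, as @geonseoks described), the value should not change:

```python
# Assumption: four 2x downsamplings before the PSAlign input
# (Conv1/pooling plus the earlier stages), with Stage 4 kept at stride 1.
num_downsamplings = 4
feature_stride = 2 ** num_downsamplings    # overall stride = 16
spatial_scale = 1.0 / feature_stride       # 0.0625, passed to PSAlign
```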