
How to train rcnn for GoogLeNet

Open marutiagarwal opened this issue 8 years ago • 36 comments

I have used the following proto file: https://github.com/BVLC/caffe/blob/master/models/bvlc_googlenet/train_val.prototxt

and created this train.prototxt for 33 output classes (including 1 for background). But as soon as I run tools/train_net.py, I get the following error:

insert_splits.cpp:35] Unknown blob input label to layer 1

Can you please have a quick look at the modified train.prototxt provided below:

name: "GoogleNet" layer { name: 'data' type: 'Python' top: 'data' top: 'rois' top: 'labels' top: 'bbox_targets' top: 'bbox_loss_weights' python_param { module: 'roi_data_layer.layer' layer: 'RoIDataLayer' param_str: "'num_classes': 33" } } layer { name: "conv1/7x7_s2" type: "Convolution" bottom: "data" top: "conv1/7x7_s2" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 pad: 3 kernel_size: 7 stride: 2 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "conv1/relu_7x7" type: "ReLU" bottom: "conv1/7x7_s2" top: "conv1/7x7_s2" } layer { name: "pool1/3x3_s2" type: "Pooling" bottom: "conv1/7x7_s2" top: "pool1/3x3_s2" pooling_param { pool: MAX kernel_size: 3 stride: 2 } } layer { name: "pool1/norm1" type: "LRN" bottom: "pool1/3x3_s2" top: "pool1/norm1" lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 } } layer { name: "conv2/3x3_reduce" type: "Convolution" bottom: "pool1/norm1" top: "conv2/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "conv2/relu_3x3_reduce" type: "ReLU" bottom: "conv2/3x3_reduce" top: "conv2/3x3_reduce" } layer { name: "conv2/3x3" type: "Convolution" bottom: "conv2/3x3_reduce" top: "conv2/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 192 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "conv2/relu_3x3" type: "ReLU" bottom: "conv2/3x3" top: "conv2/3x3" } layer { name: "conv2/norm2" type: "LRN" bottom: "conv2/3x3" top: "conv2/norm2" lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 } } layer { name: "pool2/3x3_s2" type: "Pooling" bottom: "conv2/norm2" top: "pool2/3x3_s2" pooling_param { pool: MAX kernel_size: 3 stride: 2 } } layer { name: 
"inception_3a/1x1" type: "Convolution" bottom: "pool2/3x3_s2" top: "inception_3a/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3a/relu_1x1" type: "ReLU" bottom: "inception_3a/1x1" top: "inception_3a/1x1" } layer { name: "inception_3a/3x3_reduce" type: "Convolution" bottom: "pool2/3x3_s2" top: "inception_3a/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 96 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3a/relu_3x3_reduce" type: "ReLU" bottom: "inception_3a/3x3_reduce" top: "inception_3a/3x3_reduce" } layer { name: "inception_3a/3x3" type: "Convolution" bottom: "inception_3a/3x3_reduce" top: "inception_3a/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3a/relu_3x3" type: "ReLU" bottom: "inception_3a/3x3" top: "inception_3a/3x3" } layer { name: "inception_3a/5x5_reduce" type: "Convolution" bottom: "pool2/3x3_s2" top: "inception_3a/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 16 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3a/relu_5x5_reduce" type: "ReLU" bottom: "inception_3a/5x5_reduce" top: "inception_3a/5x5_reduce" } layer { name: "inception_3a/5x5" type: "Convolution" bottom: "inception_3a/5x5_reduce" top: "inception_3a/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 32 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" 
value: 0.2 } } } layer { name: "inception_3a/relu_5x5" type: "ReLU" bottom: "inception_3a/5x5" top: "inception_3a/5x5" } layer { name: "inception_3a/pool" type: "Pooling" bottom: "pool2/3x3_s2" top: "inception_3a/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_3a/pool_proj" type: "Convolution" bottom: "inception_3a/pool" top: "inception_3a/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 32 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3a/relu_pool_proj" type: "ReLU" bottom: "inception_3a/pool_proj" top: "inception_3a/pool_proj" } layer { name: "inception_3a/output" type: "Concat" bottom: "inception_3a/1x1" bottom: "inception_3a/3x3" bottom: "inception_3a/5x5" bottom: "inception_3a/pool_proj" top: "inception_3a/output" } layer { name: "inception_3b/1x1" type: "Convolution" bottom: "inception_3a/output" top: "inception_3b/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3b/relu_1x1" type: "ReLU" bottom: "inception_3b/1x1" top: "inception_3b/1x1" } layer { name: "inception_3b/3x3_reduce" type: "Convolution" bottom: "inception_3a/output" top: "inception_3b/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3b/relu_3x3_reduce" type: "ReLU" bottom: "inception_3b/3x3_reduce" top: "inception_3b/3x3_reduce" } layer { name: "inception_3b/3x3" type: "Convolution" bottom: "inception_3b/3x3_reduce" top: "inception_3b/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 192 pad: 1 
kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3b/relu_3x3" type: "ReLU" bottom: "inception_3b/3x3" top: "inception_3b/3x3" } layer { name: "inception_3b/5x5_reduce" type: "Convolution" bottom: "inception_3a/output" top: "inception_3b/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 32 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3b/relu_5x5_reduce" type: "ReLU" bottom: "inception_3b/5x5_reduce" top: "inception_3b/5x5_reduce" } layer { name: "inception_3b/5x5" type: "Convolution" bottom: "inception_3b/5x5_reduce" top: "inception_3b/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 96 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3b/relu_5x5" type: "ReLU" bottom: "inception_3b/5x5" top: "inception_3b/5x5" } layer { name: "inception_3b/pool" type: "Pooling" bottom: "inception_3a/output" top: "inception_3b/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_3b/pool_proj" type: "Convolution" bottom: "inception_3b/pool" top: "inception_3b/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_3b/relu_pool_proj" type: "ReLU" bottom: "inception_3b/pool_proj" top: "inception_3b/pool_proj" } layer { name: "inception_3b/output" type: "Concat" bottom: "inception_3b/1x1" bottom: "inception_3b/3x3" bottom: "inception_3b/5x5" bottom: "inception_3b/pool_proj" top: "inception_3b/output" } layer { name: "pool3/3x3_s2" type: "Pooling" bottom: "inception_3b/output" top: "pool3/3x3_s2" pooling_param { pool: MAX 
kernel_size: 3 stride: 2 } } layer { name: "inception_4a/1x1" type: "Convolution" bottom: "pool3/3x3_s2" top: "inception_4a/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 192 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4a/relu_1x1" type: "ReLU" bottom: "inception_4a/1x1" top: "inception_4a/1x1" } layer { name: "inception_4a/3x3_reduce" type: "Convolution" bottom: "pool3/3x3_s2" top: "inception_4a/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 96 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4a/relu_3x3_reduce" type: "ReLU" bottom: "inception_4a/3x3_reduce" top: "inception_4a/3x3_reduce" } layer { name: "inception_4a/3x3" type: "Convolution" bottom: "inception_4a/3x3_reduce" top: "inception_4a/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 208 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4a/relu_3x3" type: "ReLU" bottom: "inception_4a/3x3" top: "inception_4a/3x3" } layer { name: "inception_4a/5x5_reduce" type: "Convolution" bottom: "pool3/3x3_s2" top: "inception_4a/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 16 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4a/relu_5x5_reduce" type: "ReLU" bottom: "inception_4a/5x5_reduce" top: "inception_4a/5x5_reduce" } layer { name: "inception_4a/5x5" type: "Convolution" bottom: "inception_4a/5x5_reduce" top: "inception_4a/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 48 pad: 2 kernel_size: 5 weight_filler { type: 
"xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4a/relu_5x5" type: "ReLU" bottom: "inception_4a/5x5" top: "inception_4a/5x5" } layer { name: "inception_4a/pool" type: "Pooling" bottom: "pool3/3x3_s2" top: "inception_4a/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_4a/pool_proj" type: "Convolution" bottom: "inception_4a/pool" top: "inception_4a/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4a/relu_pool_proj" type: "ReLU" bottom: "inception_4a/pool_proj" top: "inception_4a/pool_proj" } layer { name: "inception_4a/output" type: "Concat" bottom: "inception_4a/1x1" bottom: "inception_4a/3x3" bottom: "inception_4a/5x5" bottom: "inception_4a/pool_proj" top: "inception_4a/output" } layer { name: "loss1/ave_pool" type: "Pooling" bottom: "inception_4a/output" top: "loss1/ave_pool" pooling_param { pool: AVE kernel_size: 5 stride: 3 } } layer { name: "loss1/conv" type: "Convolution" bottom: "loss1/ave_pool" top: "loss1/conv" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "loss1/relu_conv" type: "ReLU" bottom: "loss1/conv" top: "loss1/conv" } layer { name: "loss1/fc" type: "InnerProduct" bottom: "loss1/conv" top: "loss1/fc" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 1024 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "loss1/relu_fc" type: "ReLU" bottom: "loss1/fc" top: "loss1/fc" } layer { name: "loss1/drop_fc" type: "Dropout" bottom: "loss1/fc" top: "loss1/fc" dropout_param { dropout_ratio: 0.7 } } layer { name: "loss1/classifier" 
type: "InnerProduct" bottom: "loss1/fc" top: "loss1/classifier" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 1000 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0 } } } layer { name: "loss1/loss" type: "SoftmaxWithLoss" bottom: "loss1/classifier" bottom: "label" top: "loss1/loss1" loss_weight: 0.3 } layer { name: "loss1/top-1" type: "Accuracy" bottom: "loss1/classifier" bottom: "label" top: "loss1/top-1" include { phase: TEST } } layer { name: "loss1/top-5" type: "Accuracy" bottom: "loss1/classifier" bottom: "label" top: "loss1/top-5" include { phase: TEST } accuracy_param { top_k: 5 } } layer { name: "inception_4b/1x1" type: "Convolution" bottom: "inception_4a/output" top: "inception_4b/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 160 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4b/relu_1x1" type: "ReLU" bottom: "inception_4b/1x1" top: "inception_4b/1x1" } layer { name: "inception_4b/3x3_reduce" type: "Convolution" bottom: "inception_4a/output" top: "inception_4b/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 112 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4b/relu_3x3_reduce" type: "ReLU" bottom: "inception_4b/3x3_reduce" top: "inception_4b/3x3_reduce" } layer { name: "inception_4b/3x3" type: "Convolution" bottom: "inception_4b/3x3_reduce" top: "inception_4b/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 224 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4b/relu_3x3" type: "ReLU" bottom: "inception_4b/3x3" top: "inception_4b/3x3" } layer { name: 
"inception_4b/5x5_reduce" type: "Convolution" bottom: "inception_4a/output" top: "inception_4b/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 24 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4b/relu_5x5_reduce" type: "ReLU" bottom: "inception_4b/5x5_reduce" top: "inception_4b/5x5_reduce" } layer { name: "inception_4b/5x5" type: "Convolution" bottom: "inception_4b/5x5_reduce" top: "inception_4b/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4b/relu_5x5" type: "ReLU" bottom: "inception_4b/5x5" top: "inception_4b/5x5" } layer { name: "inception_4b/pool" type: "Pooling" bottom: "inception_4a/output" top: "inception_4b/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_4b/pool_proj" type: "Convolution" bottom: "inception_4b/pool" top: "inception_4b/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4b/relu_pool_proj" type: "ReLU" bottom: "inception_4b/pool_proj" top: "inception_4b/pool_proj" } layer { name: "inception_4b/output" type: "Concat" bottom: "inception_4b/1x1" bottom: "inception_4b/3x3" bottom: "inception_4b/5x5" bottom: "inception_4b/pool_proj" top: "inception_4b/output" } layer { name: "inception_4c/1x1" type: "Convolution" bottom: "inception_4b/output" top: "inception_4c/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: 
"inception_4c/relu_1x1" type: "ReLU" bottom: "inception_4c/1x1" top: "inception_4c/1x1" } layer { name: "inception_4c/3x3_reduce" type: "Convolution" bottom: "inception_4b/output" top: "inception_4c/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4c/relu_3x3_reduce" type: "ReLU" bottom: "inception_4c/3x3_reduce" top: "inception_4c/3x3_reduce" } layer { name: "inception_4c/3x3" type: "Convolution" bottom: "inception_4c/3x3_reduce" top: "inception_4c/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4c/relu_3x3" type: "ReLU" bottom: "inception_4c/3x3" top: "inception_4c/3x3" } layer { name: "inception_4c/5x5_reduce" type: "Convolution" bottom: "inception_4b/output" top: "inception_4c/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 24 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4c/relu_5x5_reduce" type: "ReLU" bottom: "inception_4c/5x5_reduce" top: "inception_4c/5x5_reduce" } layer { name: "inception_4c/5x5" type: "Convolution" bottom: "inception_4c/5x5_reduce" top: "inception_4c/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4c/relu_5x5" type: "ReLU" bottom: "inception_4c/5x5" top: "inception_4c/5x5" } layer { name: "inception_4c/pool" type: "Pooling" bottom: "inception_4b/output" top: "inception_4c/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } 
layer { name: "inception_4c/pool_proj" type: "Convolution" bottom: "inception_4c/pool" top: "inception_4c/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4c/relu_pool_proj" type: "ReLU" bottom: "inception_4c/pool_proj" top: "inception_4c/pool_proj" } layer { name: "inception_4c/output" type: "Concat" bottom: "inception_4c/1x1" bottom: "inception_4c/3x3" bottom: "inception_4c/5x5" bottom: "inception_4c/pool_proj" top: "inception_4c/output" } layer { name: "inception_4d/1x1" type: "Convolution" bottom: "inception_4c/output" top: "inception_4d/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 112 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4d/relu_1x1" type: "ReLU" bottom: "inception_4d/1x1" top: "inception_4d/1x1" } layer { name: "inception_4d/3x3_reduce" type: "Convolution" bottom: "inception_4c/output" top: "inception_4d/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 144 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4d/relu_3x3_reduce" type: "ReLU" bottom: "inception_4d/3x3_reduce" top: "inception_4d/3x3_reduce" } layer { name: "inception_4d/3x3" type: "Convolution" bottom: "inception_4d/3x3_reduce" top: "inception_4d/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 288 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4d/relu_3x3" type: "ReLU" bottom: "inception_4d/3x3" top: "inception_4d/3x3" } layer { name: "inception_4d/5x5_reduce" type: "Convolution" bottom: 
"inception_4c/output" top: "inception_4d/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 32 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4d/relu_5x5_reduce" type: "ReLU" bottom: "inception_4d/5x5_reduce" top: "inception_4d/5x5_reduce" } layer { name: "inception_4d/5x5" type: "Convolution" bottom: "inception_4d/5x5_reduce" top: "inception_4d/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4d/relu_5x5" type: "ReLU" bottom: "inception_4d/5x5" top: "inception_4d/5x5" } layer { name: "inception_4d/pool" type: "Pooling" bottom: "inception_4c/output" top: "inception_4d/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_4d/pool_proj" type: "Convolution" bottom: "inception_4d/pool" top: "inception_4d/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 64 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4d/relu_pool_proj" type: "ReLU" bottom: "inception_4d/pool_proj" top: "inception_4d/pool_proj" } layer { name: "inception_4d/output" type: "Concat" bottom: "inception_4d/1x1" bottom: "inception_4d/3x3" bottom: "inception_4d/5x5" bottom: "inception_4d/pool_proj" top: "inception_4d/output" } layer { name: "loss2/ave_pool" type: "Pooling" bottom: "inception_4d/output" top: "loss2/ave_pool" pooling_param { pool: AVE kernel_size: 5 stride: 3 } } layer { name: "loss2/conv" type: "Convolution" bottom: "loss2/ave_pool" top: "loss2/conv" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" 
} bias_filler { type: "constant" value: 0.2 } } } layer { name: "loss2/relu_conv" type: "ReLU" bottom: "loss2/conv" top: "loss2/conv" } layer { name: "loss2/fc" type: "InnerProduct" bottom: "loss2/conv" top: "loss2/fc" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 1024 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "loss2/relu_fc" type: "ReLU" bottom: "loss2/fc" top: "loss2/fc" } layer { name: "loss2/drop_fc" type: "Dropout" bottom: "loss2/fc" top: "loss2/fc" dropout_param { dropout_ratio: 0.7 } } layer { name: "loss2/classifier" type: "InnerProduct" bottom: "loss2/fc" top: "loss2/classifier" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 1000 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0 } } } layer { name: "loss2/loss" type: "SoftmaxWithLoss" bottom: "loss2/classifier" bottom: "label" top: "loss2/loss1" loss_weight: 0.3 } layer { name: "loss2/top-1" type: "Accuracy" bottom: "loss2/classifier" bottom: "label" top: "loss2/top-1" include { phase: TEST } } layer { name: "loss2/top-5" type: "Accuracy" bottom: "loss2/classifier" bottom: "label" top: "loss2/top-5" include { phase: TEST } accuracy_param { top_k: 5 } } layer { name: "inception_4e/1x1" type: "Convolution" bottom: "inception_4d/output" top: "inception_4e/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 256 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4e/relu_1x1" type: "ReLU" bottom: "inception_4e/1x1" top: "inception_4e/1x1" } layer { name: "inception_4e/3x3_reduce" type: "Convolution" bottom: "inception_4d/output" top: "inception_4e/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 160 kernel_size: 1 weight_filler { type: 
"xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4e/relu_3x3_reduce" type: "ReLU" bottom: "inception_4e/3x3_reduce" top: "inception_4e/3x3_reduce" } layer { name: "inception_4e/3x3" type: "Convolution" bottom: "inception_4e/3x3_reduce" top: "inception_4e/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 320 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4e/relu_3x3" type: "ReLU" bottom: "inception_4e/3x3" top: "inception_4e/3x3" } layer { name: "inception_4e/5x5_reduce" type: "Convolution" bottom: "inception_4d/output" top: "inception_4e/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 32 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4e/relu_5x5_reduce" type: "ReLU" bottom: "inception_4e/5x5_reduce" top: "inception_4e/5x5_reduce" } layer { name: "inception_4e/5x5" type: "Convolution" bottom: "inception_4e/5x5_reduce" top: "inception_4e/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_4e/relu_5x5" type: "ReLU" bottom: "inception_4e/5x5" top: "inception_4e/5x5" } layer { name: "inception_4e/pool" type: "Pooling" bottom: "inception_4d/output" top: "inception_4e/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_4e/pool_proj" type: "Convolution" bottom: "inception_4e/pool" top: "inception_4e/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: 
"inception_4e/relu_pool_proj" type: "ReLU" bottom: "inception_4e/pool_proj" top: "inception_4e/pool_proj" } layer { name: "inception_4e/output" type: "Concat" bottom: "inception_4e/1x1" bottom: "inception_4e/3x3" bottom: "inception_4e/5x5" bottom: "inception_4e/pool_proj" top: "inception_4e/output" } layer { name: "pool4/3x3_s2" type: "Pooling" bottom: "inception_4e/output" top: "pool4/3x3_s2" pooling_param { pool: MAX kernel_size: 3 stride: 2 } } layer { name: "inception_5a/1x1" type: "Convolution" bottom: "pool4/3x3_s2" top: "inception_5a/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 256 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5a/relu_1x1" type: "ReLU" bottom: "inception_5a/1x1" top: "inception_5a/1x1" } layer { name: "inception_5a/3x3_reduce" type: "Convolution" bottom: "pool4/3x3_s2" top: "inception_5a/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 160 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5a/relu_3x3_reduce" type: "ReLU" bottom: "inception_5a/3x3_reduce" top: "inception_5a/3x3_reduce" } layer { name: "inception_5a/3x3" type: "Convolution" bottom: "inception_5a/3x3_reduce" top: "inception_5a/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 320 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5a/relu_3x3" type: "ReLU" bottom: "inception_5a/3x3" top: "inception_5a/3x3" } layer { name: "inception_5a/5x5_reduce" type: "Convolution" bottom: "pool4/3x3_s2" top: "inception_5a/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 32 kernel_size: 1 weight_filler { type: "xavier" } 
bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5a/relu_5x5_reduce" type: "ReLU" bottom: "inception_5a/5x5_reduce" top: "inception_5a/5x5_reduce" } layer { name: "inception_5a/5x5" type: "Convolution" bottom: "inception_5a/5x5_reduce" top: "inception_5a/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5a/relu_5x5" type: "ReLU" bottom: "inception_5a/5x5" top: "inception_5a/5x5" } layer { name: "inception_5a/pool" type: "Pooling" bottom: "pool4/3x3_s2" top: "inception_5a/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_5a/pool_proj" type: "Convolution" bottom: "inception_5a/pool" top: "inception_5a/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5a/relu_pool_proj" type: "ReLU" bottom: "inception_5a/pool_proj" top: "inception_5a/pool_proj" } layer { name: "inception_5a/output" type: "Concat" bottom: "inception_5a/1x1" bottom: "inception_5a/3x3" bottom: "inception_5a/5x5" bottom: "inception_5a/pool_proj" top: "inception_5a/output" } layer { name: "inception_5b/1x1" type: "Convolution" bottom: "inception_5a/output" top: "inception_5b/1x1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 384 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5b/relu_1x1" type: "ReLU" bottom: "inception_5b/1x1" top: "inception_5b/1x1" } layer { name: "inception_5b/3x3_reduce" type: "Convolution" bottom: "inception_5a/output" top: "inception_5b/3x3_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } 
convolution_param { num_output: 192 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5b/relu_3x3_reduce" type: "ReLU" bottom: "inception_5b/3x3_reduce" top: "inception_5b/3x3_reduce" } layer { name: "inception_5b/3x3" type: "Convolution" bottom: "inception_5b/3x3_reduce" top: "inception_5b/3x3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 384 pad: 1 kernel_size: 3 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5b/relu_3x3" type: "ReLU" bottom: "inception_5b/3x3" top: "inception_5b/3x3" } layer { name: "inception_5b/5x5_reduce" type: "Convolution" bottom: "inception_5a/output" top: "inception_5b/5x5_reduce" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 48 kernel_size: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5b/relu_5x5_reduce" type: "ReLU" bottom: "inception_5b/5x5_reduce" top: "inception_5b/5x5_reduce" } layer { name: "inception_5b/5x5" type: "Convolution" bottom: "inception_5b/5x5_reduce" top: "inception_5b/5x5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 pad: 2 kernel_size: 5 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5b/relu_5x5" type: "ReLU" bottom: "inception_5b/5x5" top: "inception_5b/5x5" } layer { name: "inception_5b/pool" type: "Pooling" bottom: "inception_5a/output" top: "inception_5b/pool" pooling_param { pool: MAX kernel_size: 3 stride: 1 pad: 1 } } layer { name: "inception_5b/pool_proj" type: "Convolution" bottom: "inception_5b/pool" top: "inception_5b/pool_proj" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 128 kernel_size: 1 weight_filler { type: "xavier" } 
bias_filler { type: "constant" value: 0.2 } } } layer { name: "inception_5b/relu_pool_proj" type: "ReLU" bottom: "inception_5b/pool_proj" top: "inception_5b/pool_proj" } layer { name: "inception_5b/output" type: "Concat" bottom: "inception_5b/1x1" bottom: "inception_5b/3x3" bottom: "inception_5b/5x5" bottom: "inception_5b/pool_proj" top: "inception_5b/output" } layer { name: "pool5/7x7_s1" type: "Pooling" bottom: "inception_5b/output" top: "pool5/7x7_s1" pooling_param { pool: AVE kernel_size: 7 stride: 1 } } layer { name: "pool5/drop_7x7_s1" type: "Dropout" bottom: "pool5/7x7_s1" top: "pool5/7x7_s1" dropout_param { dropout_ratio: 0.4 } } layer { name: "cls_score" type: "InnerProduct" bottom: "pool5/7x7_s1" top: "cls_score" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 33 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "bbox_pred" type: "InnerProduct" bottom: "pool5/7x7_s1" top: "bbox_pred" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 132 weight_filler { type: "gaussian" std: 0.001 } bias_filler { type: "constant" value: 0 } } } layer { name: "loss_cls" type: "SoftmaxWithLoss" bottom: "cls_score" bottom: "labels" top: "loss_cls" loss_weight: 1 } layer { name: "loss_bbox" type: "SmoothL1Loss" bottom: "bbox_pred" bottom: "bbox_targets" bottom: "bbox_loss_weights" top: "loss_bbox" loss_weight: 1 }

marutiagarwal avatar Jul 13 '15 09:07 marutiagarwal

@marutiagarwal Your network is missing the RoI pooling layer. If you look at the .prototxt for VGG or CaffeNet, you'll see they both contain one:

layer {
  name: "roi_pool5"
  type: "ROIPooling"
  bottom: "conv5_3"
  bottom: "rois"
  top: "pool5"
  roi_pooling_param {
    pooled_w: 7
    pooled_h: 7
    spatial_scale: 0.0625 # 1/16
  }
}

grib0ed0v avatar Aug 21 '15 15:08 grib0ed0v

@AlexGruzdev I have a question. Why is the top connected to "pool5" instead of "roi_pool5"? I don't see a pool5 layer in the prototxt file. Thank you very much~

hermitman avatar Sep 18 '15 20:09 hermitman

@hermitman "pool5" is just the name of an output blob. The next layer takes this "pool5" blob as its bottom:

...
layer {
  name: "inception_5b/output"
  type: "Concat"
  bottom: "inception_5b/1x1"
  bottom: "inception_5b/3x3"
  bottom: "inception_5b/5x5"
  bottom: "inception_5b/pool_proj"
  top: "inception_5b/output"
}
layer {
  name: "roi_pool5"
  type: "ROIPooling"
  bottom: "inception_5b/output"
  bottom: "rois"
  top: "pool5"
  roi_pooling_param {
    pooled_w: 7
    pooled_h: 7
    spatial_scale: 0.0625 # 1/16
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
...
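To see why the "pool5" top always has the same shape no matter how large the RoI is, here is an illustrative pure-Python sketch of RoI max-pooling (not the actual Caffe implementation): an arbitrary-sized region of the feature map is divided into a fixed pooled_h x pooled_w grid of bins, and the max is taken in each bin.

```python
# Sketch only: RoI max-pooling over a 2-D feature map. Any RoI size
# collapses to a fixed pooled_h x pooled_w output, which is what lets
# the fixed-size fc layers follow variable-sized proposals.
import math

def roi_max_pool(feat, roi, pooled_h=7, pooled_w=7):
    """feat: 2-D list (H x W); roi: (x1, y1, x2, y2) in feature-map coords."""
    x1, y1, x2, y2 = roi
    roi_h = max(y2 - y1 + 1, 1)
    roi_w = max(x2 - x1 + 1, 1)
    bin_h = roi_h / pooled_h
    bin_w = roi_w / pooled_w
    out = [[float('-inf')] * pooled_w for _ in range(pooled_h)]
    for ph in range(pooled_h):
        for pw in range(pooled_w):
            # each output cell covers one bin of the RoI
            hstart = y1 + int(math.floor(ph * bin_h))
            hend = y1 + int(math.ceil((ph + 1) * bin_h))
            wstart = x1 + int(math.floor(pw * bin_w))
            wend = x1 + int(math.ceil((pw + 1) * bin_w))
            for h in range(hstart, max(hend, hstart + 1)):
                for w in range(wstart, max(wend, wstart + 1)):
                    out[ph][pw] = max(out[ph][pw], feat[h][w])
    return out

feat = [[h * 100 + w for w in range(40)] for h in range(30)]  # fake 30x40 map
pooled = roi_max_pool(feat, (3, 2, 25, 20))
print(len(pooled), len(pooled[0]))  # -> 7 7
```

The exact bin-boundary rounding in roi_pooling_layer.cpp may differ slightly; the point is only that the output grid size is fixed.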

grib0ed0v avatar Sep 21 '15 07:09 grib0ed0v

layer {
  name: "conv1/7x7_s2"
  type: "Convolution"
  bottom: "data"
  top: "conv1/7x7_s2"
  param { lr_mult: 1 decay_mult: 1 }

In /fast-rcnn/models/VGG16 (and CaffeNet)/train.prototxt, the lr_mult and decay_mult parameters for the early layers are 0, so why do you set them to 1? I still can't really understand the prototxt definition. Do you have any references? Thank you very much.

catsdogone avatar Nov 06 '15 03:11 catsdogone

If you set lr_mult and decay_mult to 0, you effectively 'freeze' the layer: its weights will not be updated during training.
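A toy illustration (not Caffe code) of why this works: Caffe scales each parameter's update by its lr_mult, so the effective learning rate is base_lr * lr_mult, and a multiplier of 0 yields a zero update.

```python
# Toy SGD step with a per-layer learning-rate multiplier, mimicking
# how lr_mult scales the update in Caffe's solver.
def sgd_step(weight, grad, base_lr, lr_mult):
    return weight - base_lr * lr_mult * grad

frozen = sgd_step(0.5, 1.3, base_lr=0.001, lr_mult=0)   # frozen layer
trained = sgd_step(0.5, 1.3, base_lr=0.001, lr_mult=1)  # normal layer
print(frozen, trained)  # frozen stays at 0.5; trained moves toward the gradient
```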

grib0ed0v avatar Nov 09 '15 09:11 grib0ed0v

I'm trying to use this model too, but I'm running into a problem with the input dimensions and the loss1/fc and loss2/fc InnerProduct layers. I don't fully understand how Fast R-CNN can use the model when the image input dimensions are variable, as they are here. The input images are resized (not cropped) in lib/utils/blob.py, so the input dimensions are not consistently 224x224; they may be, say, 150x224. This doesn't seem to affect the CaffeNet/VGG models, but it does affect GoogLeNet, since it causes varying input sizes to the loss InnerProduct layers.

What is the best way to handle this? Should input images be resized to 224x224 to match the model, losing the aspect ratio? Or should the loss layers be modified? I tried changing loss1/fc to convolutional, but that just pushed the problem a little further down. Or should these intermediate loss layers be removed entirely?
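For reference, the rescaling in lib/utils/blob.py (prep_im_for_blob) preserves aspect ratio: it scales so the shorter side reaches a target size (600 by default in Fast R-CNN's config) unless that would push the longer side past a cap (1000 by default). A rough sketch of that scale computation, with those default values assumed:

```python
# Sketch of Fast R-CNN's aspect-preserving rescale: short side -> target_size,
# capped so the long side never exceeds max_size. Defaults assumed from
# the Fast R-CNN config (TRAIN.SCALES / TRAIN.MAX_SIZE).
def im_scale(height, width, target_size=600, max_size=1000):
    short, long_ = min(height, width), max(height, width)
    scale = float(target_size) / short
    if round(scale * long_) > max_size:      # cap the longer side
        scale = float(max_size) / long_
    return scale

s = im_scale(375, 500)                 # typical VOC image
print(round(375 * s), round(500 * s))  # -> 600 800
```

So the network must tolerate variable input sizes, which is exactly why the fixed-size GoogLeNet loss1/loss2 fc heads break while the RoI-pooled head does not.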

Not a Caffe expert by any means, so apologies if these questions seem basic.

jond55 avatar Nov 16 '15 17:11 jond55

I got GoogLeNet working with Fast R-CNN, though performance isn't as good as VGG16's: a significantly lower true-positive rate across ~100 classes. I'm wondering why that would be.

googlenet_test_caffe.prototxt.txt googlenet_train_caffe.prototxt.txt

marutiagarwal avatar Feb 16 '16 04:02 marutiagarwal

@jond55 I don't think the input image needs to be resized to a fixed shape; the roi_pooling layer is where the magic happens. The input image can have any aspect ratio, but I think the minimum side should be larger than 224.

catsdogone avatar Feb 17 '16 03:02 catsdogone

@marutiagarwal

roi_pooling_param {
  pooled_w: 7
  pooled_h: 7
  spatial_scale: 0.0625 # 1/16
}

Is your spatial_scale compatible with the original model?

catsdogone avatar Feb 17 '16 03:02 catsdogone

@catsdogone - No, I wasn't sure about its value, so I used it as-is from VGG16. Everything else looks OK. I haven't resized the input images; I'm using their original sizes only. Although input_shape { dim: 1 dim: 3 dim: 224 dim: 224 } is present in test.prototxt — is that meant to take a crop from the input?

marutiagarwal avatar Feb 17 '16 03:02 marutiagarwal

@marutiagarwal I think this value is determined by the strides of the layers.

catsdogone avatar Feb 17 '16 04:02 catsdogone

@catsdogone - its value is 0.0625 for both VGG16 and VGG_CNN_M. I don't think this alone should affect performance by such a large margin. I'm still not sure what is causing such poor performance for GoogLeNet + Fast R-CNN.

marutiagarwal avatar Feb 17 '16 04:02 marutiagarwal

@catsdogone - any thoughts on that?

marutiagarwal avatar Feb 18 '16 05:02 marutiagarwal

I got GoogLeNet working with Faster R-CNN as well; on VOC2007 I got ~65% mAP, not as good as VGG.

201power avatar Feb 19 '16 01:02 201power

@201power - can you please upload your train, test, and solver proto files? Maybe we can help improve each other's systems. Thanks!

marutiagarwal avatar Feb 19 '16 01:02 marutiagarwal

@201power - Can you share your train prototxt files? Thanks~

bultina0 avatar Feb 24 '16 01:02 bultina0

@201power I am also wondering about it

RalphMao avatar Mar 09 '16 09:03 RalphMao

@rbgirshick could you comment on this? I would highly appreciate it. Thanks.

marutiagarwal avatar Mar 29 '16 05:03 marutiagarwal

@marutiagarwal I think the spatial_scale parameter is very important, since it is the scaling factor for projecting RoI coordinates onto the feature map. For VGG16, if the input image is 224x224, conv5_3 is 14x14, so spatial_scale is 0.0625 (1/16, since 224/14 = 16). I think your layer { name: "roi_pool3" ... }'s roi_pooling_param should be:

roi_pooling_param {
  pooled_w: 7
  pooled_h: 7
  spatial_scale: 0.03125 # 1/32
}

since inception_5b/output is 7x7 when the input image is 224x224 (224/7 = 32).

But I got worse accuracy with this setting, so maybe I'm wrong.
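The spatial_scale is just the inverse of the cumulative stride up to the RoI-pooled feature map. A quick sketch (the stride lists below summarize the stride-2 stages of each backbone, which is my reading of the prototxts):

```python
# spatial_scale = 1 / (product of strides up to the RoI-pooled blob).
# VGG16's conv5_3 sits behind four stride-2 pools; GoogLeNet's
# inception_5b/output sits behind five stride-2 stages (conv1 + four pools).
from functools import reduce

def spatial_scale(strides):
    return 1.0 / reduce(lambda a, b: a * b, strides, 1)

vgg16_conv5_3 = spatial_scale([2, 2, 2, 2])
googlenet_5b = spatial_scale([2, 2, 2, 2, 2])
print(vgg16_conv5_3, googlenet_5b)  # -> 0.0625 0.03125
```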

catsdogone avatar Apr 15 '16 08:04 catsdogone

@201power Could you please share your parameters? Will highly appreciate that. Thanks.

catsdogone avatar Apr 16 '16 09:04 catsdogone

@marutiagarwal or @201power would you guys be able to share your parameters?

jainanshul avatar Apr 19 '16 06:04 jainanshul

@jainanshul - I already shared my proto files, just a few posts above.

marutiagarwal avatar Apr 29 '16 04:04 marutiagarwal

I modified the prototxt and got 63.8% mAP on Pascal VOC 2007, mainly by fixing the spatial-scale issue and the last-pooling issue.

XiongweiWu avatar May 16 '16 02:05 XiongweiWu

@rbgirshick @marutiagarwal @XiongweiWu Ignoring the loss1 and loss2 layers, I just added the RPN layers after the "inception_5b/output" layer and set "'feat_stride': 32" in the 'rpn-data_faster3' and 'proposal_faster3' layers. For the pooling layer, the parameters are (since there are five stride-2 downsampling stages):

layer {
  name: "roi_pool5_faster3"
  type: "ROIPooling"
  bottom: "inception_5b/output"
  bottom: "rois_faster3"
  top: "pool5_faster3"
  roi_pooling_param {
    pooled_w: 7
    pooled_h: 7
    spatial_scale: 0.03125 # 1/32
  }
}

But my precision is still lower than 10%. Is there something wrong? Thank you for your help.

This is my prototxt: test.txt train.txt

catsdogone avatar May 30 '16 10:05 catsdogone

A note to anybody who would like to test the prototxt files from @catsdogone: the files assume 8 classes and need to be edited accordingly for a different number of classes (e.g. 21 for Pascal VOC).

kamadforge avatar Jun 07 '16 09:06 kamadforge

Have you successfully trained a GoogLeNet/Inception network definition that works with Fast R-CNN/Faster R-CNN? @kamadforge @marutiagarwal @jainanshul

catsdogone avatar Jul 12 '16 06:07 catsdogone

Yes! It's just that I get a high false-positive rate. I think the problem might come from having three fc classifiers instead of one, as in most other networks, since backprop starts from all three of these fc classifier layers.

marutiagarwal avatar Jul 12 '16 07:07 marutiagarwal

@marutiagarwal Could you share your parameter settings or experience from your training? I have tried many times with 1 or 3 fc classifiers, but all attempts failed; the AP is always just about 2%. Except for the spatial_scale, my net definition is similar to yours. Thank you very much.

catsdogone avatar Jul 13 '16 00:07 catsdogone

Why is the 1st blob

input: "data"
input_shape {
  dim: 1
  dim: 3
  dim: 224
  dim: 224
}

Why 224x224? I thought Fast R-CNN takes a large input and proposes regions later?

Soccer-Video-Analysis avatar Dec 02 '16 09:12 Soccer-Video-Analysis

I believe that it's rescaled for every image.

nw89 avatar Jan 05 '17 13:01 nw89

@marutiagarwal can you share your prototxt files? I need them for single-class detection.

joyivan avatar Feb 24 '17 03:02 joyivan

@catsdogone Hi, I'm trying to train GoogLeNet with R-CNN and ran into the same low-AP problem using your test/train prototxt. May I ask how you solved it? Many thanks!

haoyul avatar May 17 '17 19:05 haoyul

Hi @marutiagarwal, I modified the original GoogLeNet prototxt and it's closer to @catsdogone's version. Why are there no RPN and RoI proposal layers in yours? Has anyone here trained successfully with @catsdogone's prototxt? My training also only gets about 0.1 mAP on Pascal VOC 2007. Thanks for any advice!

haoyul avatar May 23 '17 14:05 haoyul

I might be late to the party, but anyway: I've trained a bvlc_googlenet-based Faster R-CNN model on the Pascal VOC 2007 dataset. Its mAP is very close to the original VGG16-based Faster R-CNN, and it's about twice as fast.

I've shared my experience in my blog post. Have a look if you are interested: https://jkjung-avt.github.io/making-frcn-faster/

jkjung-avt avatar Mar 30 '18 14:03 jkjung-avt

Can anyone explain what spatial_scale is, please?

isalirezag avatar Jun 21 '18 00:06 isalirezag

The "spatial_scale" in the ROIPooling layer specifies how much the input/bottom blob has been scaled down from the original input image. For example, the VGG16 feature extractor produces a 512x30x40 blob (conv5_3) from a 3x480x640 input image. In that case we set spatial_scale to 0.0625 (1/16), since the height and width of the ROIPooling layer's bottom blob are 1/16 of the input image's.

Reference: roi_pooling_layer.cpp
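To make that concrete, here is a rough sketch of how an RoI given in input-image pixels is projected onto the feature map using spatial_scale (the exact rounding in the Caffe layer may differ; this just shows the role of the factor):

```python
# Project an RoI from input-image coordinates to feature-map coordinates
# by multiplying every coordinate by spatial_scale.
def project_roi(roi, spatial_scale):
    """Map (x1, y1, x2, y2) from image pixels to feature-map cells."""
    return tuple(int(round(c * spatial_scale)) for c in roi)

# A 3x480x640 image -> 512x30x40 conv5_3 blob, so spatial_scale = 1/16.
roi_img = (64, 32, 320, 240)        # RoI in input-image coordinates
roi_feat = project_roi(roi_img, 0.0625)
print(roi_feat)  # -> (4, 2, 20, 15)
```

If spatial_scale doesn't match the network's actual downsampling factor, every RoI is pooled from the wrong part of the feature map, which is why a mismatched value hurts detection accuracy so badly.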

jkjung-avt avatar Jun 21 '18 07:06 jkjung-avt