Incremental-Network-Quantization
Effectiveness on Detection Tasks and Tiny Models?
Hi @Zhouaojun, thanks a lot for your awesome work. I've tried your algorithm on an object detection task (car detection, to be concrete) and it truly works. In my experiment, given the same detection threshold, the performance of the INQ-quantized SSD model almost matches the original model in precision; in terms of recall, however, it suffered (recall decreased by nearly 3%). Since you didn't mention how this algorithm behaves on regression tasks, could you share some thoughts on that?
Another thing I found is that it handles huge models well (say, > 100 MB in parameters), but seems comparatively inadequate for quantizing tiny models (weight size < 1 MB). I tried hard, fine-tuning with different partition schemes, but still was unable to recover the full-precision model's classification accuracy.
Finally, I did conduct some experiments on activation quantization, but neither ReLU6 nor fixed-point trimming worked for me. Could you also shed some light on the teaser in your Appendix?
Thank you again.
@power0341 Hi, thanks for your attention.
- About the detection task: I'm sorry, I am not familiar with detection tasks. If I find time to work on detection-task quantization, I will share some results with you.
- Light networks (MobileNet, ShuffleNet) are more difficult to quantize than huge networks (such as ResNets or VGG). I think you can use larger bit widths, such as 8-bit activations and 6-bit weights, and a more careful partition step, such as a 0.05 pace; you can also refer to the latest Google MobileNet quantization paper. See the first sketch after this list.
- For activation (ReLU) quantization, my solution fixes the maximum, so you can quantize the float values into bins within [0, max]; see the second sketch after this list.
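Not the exact implementation from the paper, just a rough NumPy sketch of what an incremental power-of-two partition with a finer pace could look like; the codebook bounds n1/n2, the nearest-value rounding, and the schedule fractions below are illustrative assumptions:

```python
import numpy as np

def quantize_to_power_of_two(w, n1, n2):
    # Snap each value to the nearest entry of {0} U {+/-2^k : n2 <= k <= n1}.
    # INQ's actual rounding rule differs slightly, but the idea is the same.
    exps = np.arange(n2, n1 + 1)
    levels = np.concatenate(([0.0], 2.0 ** exps, -(2.0 ** exps)))
    idx = np.argmin(np.abs(w.reshape(-1, 1) - levels.reshape(1, -1)), axis=1)
    return levels[idx].reshape(w.shape)

def inq_partition_step(weights, frozen_mask, target_fraction, n1, n2):
    # Grow the quantized (frozen) set to `target_fraction` of all weights,
    # picking the largest-magnitude unquantized weights first; the remaining
    # weights stay full-precision and are re-trained before the next step.
    w = weights.ravel().copy()
    m = frozen_mask.ravel().copy()
    n_new = int(target_fraction * w.size) - int(m.sum())
    if n_new > 0:
        candidates = np.where(~m)[0]
        order = np.argsort(-np.abs(w[candidates]))
        chosen = candidates[order[:n_new]]
        w[chosen] = quantize_to_power_of_two(w[chosen], n1, n2)
        m[chosen] = True
    return w.reshape(weights.shape), m.reshape(frozen_mask.shape)

# A finer schedule for a small model, stepping in roughly 0.05-0.1 paces
# instead of a coarse {0.5, 0.75, 0.875, 1.0} partition (fractions are
# just examples, not recommended values):
schedule = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.85, 0.9, 0.95, 1.0]
```

Each partition step would be followed by fine-tuning only the still-unquantized weights before moving on to the next fraction, as in the paper.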
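And a minimal sketch of the fixed-maximum activation quantization described above, assuming a uniform grid over [0, max]; `act_max` and `n_bits` are placeholder values, not the appendix's actual settings:

```python
import numpy as np

def quantize_relu(x, act_max=6.0, n_bits=8):
    # Clip ReLU outputs at a fixed maximum, then snap them to one of
    # 2^n_bits evenly spaced levels in [0, act_max].
    step = act_max / (2 ** n_bits - 1)
    return np.round(np.clip(x, 0.0, act_max) / step) * step
```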
thanks
@power0341 Which small architectures did you try, and did you make any significant adjustments to any C++ module? I tried to just run inference on a pre-trained MobileNet model and got an accuracy of 0%, while running the same pre-trained MobileNet on the standard Caffe implementation gives 70%. So there is something very different in the inference phase between the INQ Caffe and the standard Caffe implementation. Any suggestions on what I can do? @Zhouaojun
thanks!
Hey @ij10 , it sounds like you haven't had finetuned you model, have you?
Hi @power0341, I'm not sure I understand what you mean. The model (MobileNet in this case) is pre-trained on ImageNet 2012 data. When I run inference (using the standard Caffe implementation) with the pre-trained weights on the validation data (ImageNet 2012), I get an accuracy of around 70%. But when I run inference on the same validation data with the same pre-trained weights on the INQ Caffe implementation, I get 0%. If I'm not mistaken, that means there are some significant changes to the inference procedure in the INQ Caffe implementation, but I can't see what exactly.
This is the start of my mobilenet_deploy.prototxt file for mobilenet:
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.017
    mirror: false
    crop_size: 224
    mean_value: [103.94, 116.78, 123.68]
  }
  data_param {
    source: "/path/to/ilsvrc12_val_lmdb/"
    batch_size: 50
    backend: LMDB
  }
}
I use the following command to run inference on validation data (ImageNet 2012, 50K images):
~/caffe/build/tools/caffe test -model /path/to/mobilenet_deploy.prototxt -weights /path/to/mobilenet.caffemodel -gpu 0 -iterations 1000
which gives the 70% accuracy. Note that this result is achieved with the standard Caffe implementation.
Then I run the same test, but using the INQ authors' modified Caffe implementation:
~/INQ/build/tools/caffe test -model /path/to/mobilenet_deploy.prototxt -weights /path/to/mobilenet.caffemodel -gpu 0 -iterations 1000
which gives 0%. How do I get rid of this behavior?
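In case it helps to localize the difference, a rough pycaffe sketch like the one below (assuming both builds were compiled with the Python interface; paths and file names are placeholders) could dump every blob from each build and show at which layer the outputs start to diverge:

```python
# dump_blobs.py -- run twice, pointing PYTHONPATH at each build's python/ dir:
#   PYTHONPATH=~/caffe/python python dump_blobs.py standard.npz
#   PYTHONPATH=~/INQ/python   python dump_blobs.py inq.npz
import sys
import numpy as np
import caffe

caffe.set_mode_gpu()
net = caffe.Net('/path/to/mobilenet_deploy.prototxt',
                '/path/to/mobilenet.caffemodel', caffe.TEST)
net.forward()  # one batch from the LMDB defined in the prototxt
np.savez(sys.argv[1], **{name: blob.data.copy()
                         for name, blob in net.blobs.items()})

# Then compare the two dumps; the first layer with a large difference
# is where the INQ build behaves differently:
#   a, b = np.load('standard.npz'), np.load('inq.npz')
#   for name in a.files:
#       print(name, np.abs(a[name] - b[name]).max())
```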
@Zhouaojun any ideas?
Hi @ij10, have you solved your problem?
Hi @shenlinyao! Unfortunately not. Do you have any idea why this is the case or how to solve it?