Incremental-Network-Quantization
Effectiveness on Detection Tasks and Tiny Models?
Hi @Zhouaojun, thanks a lot for your awesome work. I've tried your algorithm on an object detection task (car detection, to be concrete) and it truly works. In my experiment, given the same detection threshold, the performance of the INQ-quantized SSD model almost matches the original model in precision; in terms of recall, however, it suffered (recall decreased by nearly 3%). Since you didn't mention how this algorithm behaves on regression tasks, could you share some thoughts on that?
Another thing I found is that it handles huge models well (say, > 100 MB in parameters), but seems comparatively inadequate for quantizing tiny models (weight size < 1 MB). I tried hard, fine-tuning with different partition schemes, but still was unable to recover the full-precision model's classification accuracy.
Finally, I did conduct some experiments on activation quantization, but neither ReLU6 nor fixed-point trimming worked for me. Could you also shed some light on the teaser in your Appendix?
Thank you again.
@power0341 Hi, thanks for your attention.
- About the detection task: I'm sorry, I am not familiar with detection tasks. If I find time to work on detection-task quantization, I will share some results with you.
- Light networks (MobileNet, ShuffleNet) are more difficult to quantize than huge networks (such as ResNets or VGG). I think you can use larger bit widths, such as 8-bit activations and 6-bit weights, and a more careful partition step, such as a 0.05 pace; you can also refer to the latest Google MobileNet quantization paper. See the first sketch after this list.
- For activation (ReLU) quantization, my solution fixes the maximum, so you can quantize the float values into bins within [0, max]; see the second sketch after this list.
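Not the exact implementation from the paper, just a rough NumPy sketch of what an incremental power-of-two partition with a finer pace could look like; the codebook bounds n1/n2, the nearest-value rounding, and the schedule fractions below are illustrative assumptions:

```python
import numpy as np

def quantize_to_power_of_two(w, n1, n2):
    # Snap each value to the nearest entry of {0} U {+/-2^k : n2 <= k <= n1}.
    # INQ's actual rounding rule differs slightly, but the idea is the same.
    exps = np.arange(n2, n1 + 1)
    levels = np.concatenate(([0.0], 2.0 ** exps, -(2.0 ** exps)))
    idx = np.argmin(np.abs(w.reshape(-1, 1) - levels.reshape(1, -1)), axis=1)
    return levels[idx].reshape(w.shape)

def inq_partition_step(weights, frozen_mask, target_fraction, n1, n2):
    # Grow the quantized (frozen) set to `target_fraction` of all weights,
    # picking the largest-magnitude unquantized weights first; the remaining
    # weights stay full-precision and are re-trained before the next step.
    w = weights.ravel().copy()
    m = frozen_mask.ravel().copy()
    n_new = int(target_fraction * w.size) - int(m.sum())
    if n_new > 0:
        candidates = np.where(~m)[0]
        order = np.argsort(-np.abs(w[candidates]))
        chosen = candidates[order[:n_new]]
        w[chosen] = quantize_to_power_of_two(w[chosen], n1, n2)
        m[chosen] = True
    return w.reshape(weights.shape), m.reshape(frozen_mask.shape)

# A finer schedule for a small model, stepping in roughly 0.05-0.1 paces
# instead of a coarse {0.5, 0.75, 0.875, 1.0} partition (fractions are
# just examples, not recommended values):
schedule = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.85, 0.9, 0.95, 1.0]
```

Each partition step would be followed by fine-tuning only the still-unquantized weights before moving on to the next fraction, as in the paper.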
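And a minimal sketch of the fixed-maximum activation quantization described above, assuming a uniform grid over [0, max]; `act_max` and `n_bits` are placeholder values, not the appendix's actual settings:

```python
import numpy as np

def quantize_relu(x, act_max=6.0, n_bits=8):
    # Clip ReLU outputs at a fixed maximum, then snap them to one of
    # 2^n_bits evenly spaced levels in [0, act_max].
    step = act_max / (2 ** n_bits - 1)
    return np.round(np.clip(x, 0.0, act_max) / step) * step
```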
thanks
@power0341 Which small architectures did you try, and did you make any significant adjustments to any C++ module? I tried to just run inference on a pre-trained MobileNet model and got an accuracy of 0%, while running the same pre-trained MobileNet on the standard Caffe implementation gives 70%. So there is something very different in the inference phase between the INQ Caffe and the standard Caffe implementation. Any suggestions on what I can do? @Zhouaojun
thanks!
Hey @ij10 , it sounds like you haven't had finetuned you model, have you?
Hi @power0341, I'm not sure I understand what you mean. The model (MobileNet in this case) is pre-trained on ImageNet 2012 data. When I run inference (using the standard Caffe implementation) with the pre-trained weights on the validation data (ImageNet 2012), I get an accuracy of around 70%. But when I run inference on the same validation data with the same pre-trained weights on the INQ Caffe implementation, I get 0%. If I'm not mistaken, that means there are some significant changes to the inference procedure in the INQ Caffe implementation, but I can't see what exactly.
This is the start of my mobilenet_deploy.prototxt file for mobilenet:
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.017
    mirror: false
    crop_size: 224
    mean_value: [103.94, 116.78, 123.68]
  }
  data_param {
    source: "/path/to/ilsvrc12_val_lmdb/"
    batch_size: 50
    backend: LMDB
  }
}
I use the following command to run inference on validation data (ImageNet 2012, 50K images):
~/caffe/build/tools/caffe test -model /path/to/mobilenet_deploy.prototxt -weights /path/to/mobilenet.caffemodel -gpu 0 -iterations 1000
which gives the 70% accuracy. Note that this result is achieved with the standard Caffe implementation.
Then I run the same test, but using the INQ authors' modified Caffe implementation:
~/INQ/build/tools/caffe test -model /path/to/mobilenet_deploy.prototxt -weights /path/to/mobilenet.caffemodel -gpu 0 -iterations 1000
which gives 0%. How do I get rid of this behavior?
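In case it helps to localize the difference, a rough pycaffe sketch like the one below (assuming both builds were compiled with the Python interface; paths and file names are placeholders) could dump every blob from each build and show at which layer the outputs start to diverge:

```python
# dump_blobs.py -- run twice, pointing PYTHONPATH at each build's python/ dir:
#   PYTHONPATH=~/caffe/python python dump_blobs.py standard.npz
#   PYTHONPATH=~/INQ/python   python dump_blobs.py inq.npz
import sys
import numpy as np
import caffe

caffe.set_mode_gpu()
net = caffe.Net('/path/to/mobilenet_deploy.prototxt',
                '/path/to/mobilenet.caffemodel', caffe.TEST)
net.forward()  # one batch from the LMDB defined in the prototxt
np.savez(sys.argv[1], **{name: blob.data.copy()
                         for name, blob in net.blobs.items()})

# Then compare the two dumps; the first layer with a large difference
# is where the INQ build behaves differently:
#   a, b = np.load('standard.npz'), np.load('inq.npz')
#   for name in a.files:
#       print(name, np.abs(a[name] - b[name]).max())
```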
@Zhouaojun any ideas?
Hi @ij10, have you solved your problem?
Hi @shenlinyao! Unfortunately not. Do you have any idea why this is the case or how to solve it?