MobileNet-Caffe
why scale: 0.017?
255*0.017 = 4.335, why?
1/0.017 ≈ 58.8, which is used as an approximate std value for image preprocessing.
@shicai How to calculate the std values?
It was computed by Facebook previously: [58.395, 57.12, 57.375] for the RGB channels. Please see: https://github.com/facebook/fb.resnet.torch/blob/master/datasets/imagenet.lua#L69
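To make the arithmetic concrete, here is a minimal sketch (not part of the repository) that compares the single scale 0.017 with the exact reciprocal of each per-channel std quoted above. The variable names are illustrative only.

```python
# Per-channel ImageNet std values on a 0-255 scale, as quoted in this thread (RGB order).
STD_RGB = [58.395, 57.12, 57.375]
SCALE = 0.017  # the single scale value discussed in this thread

# Exact reciprocal of each channel's std, and the relative error of using 0.017 instead.
inv_std = [1.0 / s for s in STD_RGB]
rel_err = [abs(SCALE - v) / v for v in inv_std]

for name, v, e in zip("RGB", inv_std, rel_err):
    print(f"{name}: 1/std = {v:.5f}, relative error of 0.017 = {e:.2%}")
```

Under these numbers, 0.017 stays within about 3% of 1/std for every channel, which is why one shared scale can stand in for per-channel division.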
@shicai Thx
Hi shicai. Since the scale is set when training the network, should we also apply it to the images when performing classification with classification.cpp and deploy.prototxt? That is, should we add a line to classification.cpp that scales the image by 0.017? Thank you very much! @shicai
certainly yes.
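As a rough illustration of what applying the scale at inference time means, the sketch below applies (value - mean) * 0.017 to BGR pixels in plain Python. The mean values are the ones quoted by another commenter in this thread and are assumed here, not confirmed by the author. `normalize_pixel` and `normalize_image` are hypothetical helpers, not functions from Caffe or classification.cpp.

```python
# Assumed values: BGR means as quoted elsewhere in this thread, and the 0.017 scale.
MEAN_BGR = (103.94, 116.78, 123.68)
SCALE = 0.017

def normalize_pixel(bgr):
    """Apply (value - mean) * scale to one BGR pixel with values in [0, 255]."""
    return tuple((v - m) * SCALE for v, m in zip(bgr, MEAN_BGR))

def normalize_image(image):
    """Normalize a nested list where image[row][col] is a (b, g, r) tuple."""
    return [[normalize_pixel(px) for px in row] for row in image]

# Usage: a 1x2 image holding a mean-colored pixel and a white pixel.
example = normalize_image([[MEAN_BGR, (255, 255, 255)]])
print(example)
```

The equivalent change in classification.cpp would be a multiplication by the same factor after mean subtraction; the exact place to put it depends on how that file preprocesses the image.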
Hi @shicai ,
Thank you very much for sharing your implementation!
I just want to confirm that the image preprocessing that I am using is correct,
mu = np.array([103.94, 116.78, 123.68])
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2,0,1)) # move image channels to outermost dimension
transformer.set_mean('data', mu) # subtract the dataset-mean value in each channel
transformer.set_raw_scale('data', 255) # rescale from [0, 1] to [0, 255]
transformer.set_channel_swap('data', (2,1,0)) # swap channels from RGB to BGR
transformer.set_input_scale('data', 0.017)
image = caffe.io.load_image(caffe_root + 'examples/images/cat.jpg')
transformed_image = transformer.preprocess('data', image)
net.blobs['data'].data[...] = transformed_image
When running inference on Caffe's example cat image, "caffe/examples/images/cat.jpg", I got the following top-5 classes:
(0.29621762, 'n02123159 tiger cat')
(0.14746507, 'n02119022 red fox, Vulpes vulpes')
(0.13464026, 'n02119789 kit fox, Vulpes macrotis')
(0.086514398, 'n02113023 Pembroke, Pembroke Welsh corgi')
(0.031484455, 'n02123045 tabby, tabby cat')
The correct answer is the 5th highest prediction -- tabby cat. Do you know if this is expected?
@lannylian I got the same results as you.
Could we feed in the training set without the scale operation? Most CNN models process images without it. Does scaling the input images improve the convergence speed of the network?
yes, you can train from scratch in this way. i think it makes very little difference.
How many epochs do we need for training MobileNet from scratch on ImageNet?
~100 epochs. more is better.
@shicai If I want to use this on Android, can I train with a GPU?