
./finetune.sh 0 produces no output and finishes within one second

cicido opened this issue on Dec 04 '17 · 18 comments

What hardware and operating system/distribution are you running?

Operating system: CentOS 7
CUDA version: 8.0
cuDNN version: 6.0
OpenCV version: 3.0
BLAS: open (OpenBLAS)
Python version: 3.6.3
Boost: 1.65.1
Boost-Python: 1.65.1 (with Python 3 support)

After running `./combine.sh | xargs ./calflop.sh`, I get this temp/cb_3c_3C4x_mem_bn_vgg.prototxt:

```
name: "VGG_ILSVRC_16_layers"
layer { name: "data" type: "Data" top: "data" top: "label" transform_param { mirror: false crop_size: 224 mean_value: 104.0 mean_value: 117.0 mean_value: 123.0 } data_param { source: "/home/duanxiping/channel-pruning/caffe/examples/imagenet/ilsvrc12_val_lmdb" batch_size: 32 backend: LMDB } image_data_param { source: "/data/mydata/sourcetrain.txt" batch_size: 32 shuffle: true new_dim: 256 bicubic: true } }
layer { name: "conv1_1" type: "Convolution" bottom: "data" top: "conv1_1" convolution_param { num_output: 64 pad: 1 kernel_size: 3 } }
layer { name: "relu1_1" type: "ReLU" bottom: "conv1_1" top: "conv1_1" }
layer { name: "conv1_2_V" type: "Convolution" bottom: "conv1_1" top: "conv1_2_V" convolution_param { num_output: 22 pad: 1 pad: 0 kernel_size: 3 kernel_size: 1 stride: 1 } }
layer { name: "conv1_2_H" type: "Convolution" bottom: "conv1_2_V" top: "conv1_2_H" param { lr_mult: 1.0 decay_mult: 1.0 } convolution_param { num_output: 22 bias_term: true pad: 0 pad: 1 kernel_size: 1 kernel_size: 3 stride: 1 weight_filler { type: "msra" } } }
layer { name: "conv1_2_P" type: "Convolution" bottom: "conv1_2_H" top: "conv1_2_P" param { lr_mult: 1.0 decay_mult: 1.0 } param { lr_mult: 1.0 decay_mult: 2.0 } convolution_param { num_output: 57 pad: 0 kernel_size: 1 stride: 1 weight_filler { type: "msra" } bias_filler { type: "constant" value: 0.0 } } }
layer { name: "relu1_2" type: "ReLU" bottom: "conv1_2_P" top: "conv1_2_P" }
layer { name: "pool1" type: "Pooling" bottom: "conv1_2_P" top: "pool1" pooling_param { pool: MAX kernel_size: 2 stride: 2 } }
layer { name: "conv2_1_V" type: "Convolution" bottom: "pool1" top: "conv2_1_V" convolution_param { num_output: 49 pad: 1 pad: 0 kernel_size: 3 kernel_size: 1 stride: 1 } }
layer { name: "conv2_1_H" type: "Convolution" bottom: "conv2_1_V" top: "conv2_1_H" param { lr_mult: 1.0 decay_mult: 1.0 } convolution_param { num_output: 49 bias_term: true pad: 0 pad: 1 kernel_size: 1 kernel_size: 3 stride: 1 weight_filler { type: "msra" } } }
layer { name: "conv2_1_P" type: "Convolution" bottom: "conv2_1_H" top: "conv2_1_P" param { lr_mult: 1.0 decay_mult: 1.0 } param { lr_mult: 1.0 decay_mult: 2.0 } convolution_param { num_output: 117 pad: 0 kernel_size: 1 stride: 1 weight_filler { type: "msra" } bias_filler { type: "constant" value: 0.0 } } }
layer { name: "relu2_1" type: "ReLU" bottom: "conv2_1_P" top: "conv2_1_P" }
layer { name: "conv2_2_V" type: "Convolution" bottom: "conv2_1_P" top: "conv2_2_V" convolution_param { num_output: 62 pad: 1 pad: 0 kernel_size: 3 kernel_size: 1 stride: 1 } }
layer { name: "conv2_2_H" type: "Convolution" bottom: "conv2_2_V" top: "conv2_2_H" param { lr_mult: 1.0 decay_mult: 1.0 } convolution_param { num_output: 62 bias_term: true pad: 0 pad: 1 kernel_size: 1 kernel_size: 3 stride: 1 weight_filler { type: "msra" } } }
layer { name: "conv2_2_P" type: "Convolution" bottom: "conv2_2_H" top: "conv2_2_P" param { lr_mult: 1.0 decay_mult: 1.0 } param { lr_mult: 1.0 decay_mult: 2.0 } convolution_param { num_output: 120 pad: 0 kernel_size: 1 stride: 1 weight_filler { type: "msra" } bias_filler { type: "constant" value: 0.0 } } }
layer { name: "relu2_2" type: "ReLU" bottom: "conv2_2_P" top: "conv2_2_P" }
layer { name: "pool2" type: "Pooling" bottom: "conv2_2_P" top: "pool2" pooling_param { pool: MAX kernel_size: 2 stride: 2 } }
layer { name: "conv3_1_V" type: "Convolution" bottom: "pool2" top: "conv3_1_V" convolution_param { num_output: 110 pad: 1 pad: 0 kernel_size: 3 kernel_size: 1 stride: 1 } }
layer { name: "conv3_1_H" type: "Convolution" bottom: "conv3_1_V" top: "conv3_1_H" param { lr_mult: 1.0 decay_mult: 1.0 } convolution_param { num_output: 110 bias_term: true pad: 0 pad: 1 kernel_size: 1 kernel_size: 3 stride: 1 weight_filler { type: "msra" } } }
layer { name: "conv3_1_P" type: "Convolution" bottom: "conv3_1_H" top: "conv3_1_P" param { lr_mult: 1.0 decay_mult: 1.0 } param { lr_mult: 1.0 decay_mult: 2.0 } convolution_param { num_output: 228 pad: 0 kernel_size: 1 stride: 1 weight_filler { type: "msra" } bias_filler { type: "constant" value: 0.0 } } }
layer { name: "relu3_1" type: "ReLU" bottom: "conv3_1_P" top: "conv3_1_P" }
layer { name: "conv3_2_V" type: "Convolution" bottom: "conv3_1_P" top: "conv3_2_V" convolution_param { num_output: 118 pad: 1 pad: 0 kernel_size: 3 kernel_size: 1 stride: 1 } }
layer { name: "conv3_2_H" type: "Convolution" bottom: "conv3_2_V" top: "conv3_2_H" param { lr_mult: 1.0 decay_mult: 1.0 } convolution_param { num_output: 118 bias_term: true pad: 0 pad: 1 kernel_size: 1 kernel_size: 3 stride: 1 weight_filler { type: "msra" } } }
layer { name: "conv3_2_P" type: "Convolution" bottom: "conv3_2_H" top: "conv3_2_P" param { lr_mult: 1.0 decay_mult: 1.0 } param { lr_mult: 1.0 decay_mult: 2.0 } convolution_param { num_output: 224 pad: 0 kernel_size: 1 stride: 1 weight_filler { type: "msra" } bias_filler { type: "constant" value: 0.0 } } }
layer { name: "relu3_2" type: "ReLU" bottom: "conv3_2_P" top: "conv3_2_P" }
layer { name: "conv3_3_V" type: "Convolution" bottom: "conv3_2_P" top: "conv3_3_V" convolution_param { num_output: 141 pad: 1 pad: 0 kernel_size: 3 kernel_size: 1 stride: 1 } }
layer { name: "conv3_3_H" type: "Convolution" bottom: "conv3_3_V" top: "conv3_3_H" param { lr_mult: 1.0 decay_mult: 1.0 } convolution_param { num_output: 141 bias_term: true pad: 0 pad: 1 kernel_size: 1 kernel_size: 3 stride: 1 weight_filler { type: "msra" } } }
layer { name: "conv3_3_P" type: "Convolution" bottom: "conv3_3_H" top: "conv3_3_P" param { lr_mult: 1.0 decay_mult: 1.0 } param { lr_mult: 1.0 decay_mult: 2.0 } convolution_param { num_output: 256 pad: 0 kernel_size: 1 stride: 1 weight_filler { type: "msra" } bias_filler { type: "constant" value: 0.0 } } }
layer { name: "relu3_3" type: "ReLU" bottom: "conv3_3_P" top: "conv3_3_P" }
layer { name: "pool3" type: "Pooling" bottom: "conv3_3_P" top: "pool3" pooling_param { pool: MAX kernel_size: 2 stride: 2 } }
layer { name: "conv4_1_V" type: "Convolution" bottom: "pool3" top: "conv4_1_V" convolution_param { num_output: 233 pad: 1 pad: 0 kernel_size: 3 kernel_size: 1 stride: 1 } }
layer { name: "conv4_1_H" type: "Convolution" bottom: "conv4_1_V" top: "conv4_1_H" param { lr_mult: 1.0 decay_mult: 1.0 } convolution_param { num_output: 233 bias_term: true pad: 0 pad: 1 kernel_size: 1 kernel_size: 3 stride: 1 weight_filler { type: "msra" } } }
layer { name: "conv4_1_P" type: "Convolution" bottom: "conv4_1_H" top: "conv4_1_P" param { lr_mult: 1.0 decay_mult: 1.0 } param { lr_mult: 1.0 decay_mult: 2.0 } convolution_param { num_output: 468 pad: 0 kernel_size: 1 stride: 1 weight_filler { type: "msra" } bias_filler { type: "constant" value: 0.0 } } }
layer { name: "relu4_1" type: "ReLU" bottom: "conv4_1_P" top: "conv4_1_P" }
layer { name: "conv4_2_V" type: "Convolution" bottom: "conv4_1_P" top: "conv4_2_V" convolution_param { num_output: 256 pad: 1 pad: 0 kernel_size: 3 kernel_size: 1 stride: 1 } }
layer { name: "conv4_2_H" type: "Convolution" bottom: "conv4_2_V" top: "conv4_2_H" param { lr_mult: 1.0 decay_mult: 1.0 } convolution_param { num_output: 256 bias_term: true pad: 0 pad: 1 kernel_size: 1 kernel_size: 3 stride: 1 weight_filler { type: "msra" } } }
layer { name: "conv4_2_P" type: "Convolution" bottom: "conv4_2_H" top: "conv4_2_P" param { lr_mult: 1.0 decay_mult: 1.0 } param { lr_mult: 1.0 decay_mult: 2.0 } convolution_param { num_output: 484 pad: 0 kernel_size: 1 stride: 1 weight_filler { type: "msra" } bias_filler { type: "constant" value: 0.0 } } }
layer { name: "relu4_2" type: "ReLU" bottom: "conv4_2_P" top: "conv4_2_P" }
layer { name: "conv4_3_V" type: "Convolution" bottom: "conv4_2_P" top: "conv4_3_V" convolution_param { num_output: 302 pad: 1 pad: 0 kernel_size: 3 kernel_size: 1 stride: 1 } }
layer { name: "conv4_3_H" type: "Convolution" bottom: "conv4_3_V" top: "conv4_3_H" param { lr_mult: 1.0 decay_mult: 1.0 } convolution_param { num_output: 302 bias_term: true pad: 0 pad: 1 kernel_size: 1 kernel_size: 3 stride: 1 weight_filler { type: "msra" } } }
layer { name: "conv4_3_P" type: "Convolution" bottom: "conv4_3_H" top: "conv4_3_P" param { lr_mult: 1.0 decay_mult: 1.0 } param { lr_mult: 1.0 decay_mult: 2.0 } convolution_param { num_output: 512 pad: 0 kernel_size: 1 stride: 1 weight_filler { type: "msra" } bias_filler { type: "constant" value: 0.0 } } }
layer { name: "relu4_3" type: "ReLU" bottom: "conv4_3_P" top: "conv4_3_P" }
layer { name: "pool4" type: "Pooling" bottom: "conv4_3_P" top: "pool4" pooling_param { pool: MAX kernel_size: 2 stride: 2 } }
layer { name: "conv5_1_V" type: "Convolution" bottom: "pool4" top: "conv5_1_V" convolution_param { num_output: 398 pad: 1 pad: 0 kernel_size: 3 kernel_size: 1 stride: 1 } }
layer { name: "conv5_1_H" type: "Convolution" bottom: "conv5_1_V" top: "conv5_1_H" param { lr_mult: 1.0 decay_mult: 1.0 } convolution_param { num_output: 512 bias_term: true pad: 0 pad: 1 kernel_size: 1 kernel_size: 3 stride: 1 weight_filler { type: "msra" } } }
layer { name: "relu5_1" type: "ReLU" bottom: "conv5_1_H" top: "conv5_1_H" }
layer { name: "conv5_2_V" type: "Convolution" bottom: "conv5_1_H" top: "conv5_2_V" convolution_param { num_output: 390 pad: 1 pad: 0 kernel_size: 3 kernel_size: 1 stride: 1 } }
layer { name: "conv5_2_H" type: "Convolution" bottom: "conv5_2_V" top: "conv5_2_H" param { lr_mult: 1.0 decay_mult: 1.0 } convolution_param { num_output: 512 bias_term: true pad: 0 pad: 1 kernel_size: 1 kernel_size: 3 stride: 1 weight_filler { type: "msra" } } }
layer { name: "relu5_2" type: "ReLU" bottom: "conv5_2_H" top: "conv5_2_H" }
layer { name: "conv5_3_V" type: "Convolution" bottom: "conv5_2_H" top: "conv5_3_V" convolution_param { num_output: 379 pad: 1 pad: 0 kernel_size: 3 kernel_size: 1 stride: 1 } }
layer { name: "conv5_3_H" type: "Convolution" bottom: "conv5_3_V" top: "conv5_3_H" param { lr_mult: 1.0 decay_mult: 1.0 } convolution_param { num_output: 512 bias_term: true pad: 0 pad: 1 kernel_size: 1 kernel_size: 3 stride: 1 weight_filler { type: "msra" } } }
layer { name: "relu5_3" type: "ReLU" bottom: "conv5_3_H" top: "conv5_3_H" }
layer { name: "pool5" type: "Pooling" bottom: "conv5_3_H" top: "pool5" pooling_param { pool: MAX kernel_size: 2 stride: 2 } }
layer { name: "fc6" type: "InnerProduct" bottom: "pool5" top: "fc6" inner_product_param { num_output: 4096 } }
layer { name: "relu6" type: "ReLU" bottom: "fc6" top: "fc6" }
layer { name: "drop6" type: "Dropout" bottom: "fc6" top: "fc6" dropout_param { dropout_ratio: 0.5 } }
layer { name: "fc7" type: "InnerProduct" bottom: "fc6" top: "fc7" inner_product_param { num_output: 4096 } }
layer { name: "relu7" type: "ReLU" bottom: "fc7" top: "fc7" }
layer { name: "drop7" type: "Dropout" bottom: "fc7" top: "fc7" dropout_param { dropout_ratio: 0.5 } }
layer { name: "fc8" type: "InnerProduct" bottom: "fc7" top: "fc8" inner_product_param { num_output: 1000 } }
layer { name: "loss" type: "SoftmaxWithLoss" bottom: "fc8" bottom: "label" top: "loss" }
layer { name: "accuracy/top1" type: "Accuracy" bottom: "fc8" bottom: "label" top: "accuracy@1" accuracy_param { top_k: 1 } }
layer { name: "accuracy/top5" type: "Accuracy" bottom: "fc8" bottom: "label" top: "accuracy@5" accuracy_param { top_k: 5 } }
```

The ilsvrc12_val_lmdb file exists, and temp/model contains no files. I don't know what's wrong.

cicido · Dec 04 '17

I am not sure; this problem may be related to bash_completion.

cicido · Dec 04 '17

Use `./caffe/build/tools/caffe train -solver temp/solver.prototxt -weights temp/3c_vgg.caffemodel -gpu $1` instead of finetune.sh. Does it work?

ethanhe42 · Dec 05 '17

It works, but I get the following error:

```
F1206 09:59:24.308379 29489 net.cpp:757] Cannot copy param 0 weights from layer 'conv5_1_H'; shape mismatch. Source param shape is 398 398 1 3 (475212); target param shape is 512 398 1 3 (611328). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
*** Check failure stack trace: ***
    @  0x7f723767ae6d  (unknown)
    @  0x7f723767cced  (unknown)
    @  0x7f723767aa5c  (unknown)
    @  0x7f723767d63e  (unknown)
    @  0x7f723e7ce739  caffe::Net<>::CopyTrainedLayersFrom()
    @  0x7f723e7d50f2  caffe::Net<>::CopyTrainedLayersFromBinaryProto()
    @  0x7f723e7d5188  caffe::Net<>::CopyTrainedLayersFrom()
    @  0x40c405  CopyLayers()
    @  0x40ce6d  train()
    @  0x409e0c  main
    @  0x7f7225bb7b15  __libc_start_main
    @  0x40a7b1  (unknown)
Aborted (core dumped)
```

cicido · Dec 06 '17

I noticed that the code I cloned may be too old (I opened a new issue about rankdic). The code on my machine is:

```python
for i in rankdic:
    if 'conv5' in i:
        break
    rankdic[i] = int(rankdic[i] * 4. / speed_ratio)
```

while the new code is:

```python
for i in rankdic:
    if 'conv5' in i:
        continue
    rankdic[i] = int(rankdic[i] * 4. / speed_ratio)
```

Since a Python dict has no guaranteed order, this may produce the unmatched-shape error. I will try again later.
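For illustration, a minimal sketch (with made-up rankdic values and speed_ratio) of how the two versions diverge:

```python
# Hypothetical rankdic; dict order was not guaranteed before Python 3.7,
# so a conv5 key may well be visited before the other layers.
rankdic = {'conv5_1': 512, 'conv1_1': 88, 'conv2_1': 128}
speed_ratio = 8.

old = dict(rankdic)
for i in old:
    if 'conv5' in i:
        break        # stops the loop: layers visited after the first conv5 key are never scaled
    old[i] = int(old[i] * 4. / speed_ratio)

new = dict(rankdic)
for i in new:
    if 'conv5' in i:
        continue     # skips only conv5 layers; every other layer is still scaled
    new[i] = int(new[i] * 4. / speed_ratio)

print(old)  # {'conv5_1': 512, 'conv1_1': 88, 'conv2_1': 128} -- nothing scaled
print(new)  # {'conv5_1': 512, 'conv1_1': 44, 'conv2_1': 64}
```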

cicido · Dec 06 '17

@cicido Have you solved this problem? I have now encountered the same error as you. If you solved it, can you tell me how? Thanks.

fgxfxpfzzfcc · Dec 13 '17

Sorry for the late reply. I have not solved this problem, but look at this code: `if dcfgs.dic.vh and (conv in alldic or conv in pooldic) and (convnext in self.convs)`

and the following code: `pooldic = ['conv1_2', 'conv2_2']#, 'conv3_3']`

In the pdb session, alldic is:

```
(Pdb) p alldic
['conv1_1', 'conv2_1', 'conv3_1', 'conv4_1', 'conv3_2', 'conv4_2']
```

and the (conv, convnext) pairs are:

```
(Pdb) p list(zip(convs[1:], convs[2:]+['pool5']))
[('conv1_2', 'conv2_1'), ('conv2_1', 'conv2_2'), ('conv2_2', 'conv3_1'), ('conv3_1', 'conv3_2'), ('conv3_2', 'conv3_3'), ('conv3_3', 'conv4_1'), ('conv4_1', 'conv4_2'), ('conv4_2', 'conv4_3'), ('conv4_3', 'conv5_1'), ('conv5_1', 'conv5_2'), ('conv5_2', 'conv5_3'), ('conv5_3', 'pool5')]
```

So for the ('conv4_3', 'conv5_1') pair there is no channel pruning; only VH_decompose and ITQ_decompose are applied. I wonder whether something goes wrong at the point where channel pruning is skipped.
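To make that concrete, here is a minimal sketch (assuming dcfgs.dic.vh is true and using a plain `convs` list to stand in for self.convs) of which pairs get channel pruning:

```python
# Values taken from the pdb session above; dcfgs.dic.vh is assumed to be true.
alldic = ['conv1_1', 'conv2_1', 'conv3_1', 'conv4_1', 'conv3_2', 'conv4_2']
pooldic = ['conv1_2', 'conv2_2']
convs = ['conv1_1', 'conv1_2', 'conv2_1', 'conv2_2',
         'conv3_1', 'conv3_2', 'conv3_3',
         'conv4_1', 'conv4_2', 'conv4_3',
         'conv5_1', 'conv5_2', 'conv5_3']

for conv, convnext in zip(convs[1:], convs[2:] + ['pool5']):
    pruned = (conv in alldic or conv in pooldic) and (convnext in convs)
    print(conv, '->', convnext, ':', 'channel pruning' if pruned else 'VH/ITQ decompose only')
```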

cicido · Dec 15 '17

These appear to be (original num_output, [V, H, P]) per layer:

```
(64,  64)               conv1_1
(64,  [22, 22, 55])     conv1_2
(128, [49, 49, 117])    conv2_1
(128, [62, 62, 119])    conv2_2
(256, [110, 110, 230])  conv3_1
(256, [118, 118, 223])  conv3_2
(256, [141, 141, 256])  conv3_3
(512, [233, 233, 489])  conv4_1
(512, [256, 256, 470])  conv4_2
(512, [302, 302, 512])  conv4_3
(512, [398, 398, 512])  conv5_1
(512, [390, 390, 512])  conv5_2
(512, [379, 379, 512])  conv5_3
```

cicido · Dec 15 '17

name: "conv1_1" num_output: 64 name: "conv1_2_V" num_output: 22 name: "conv1_2_H" num_output: 22 name: "conv1_2_P" num_output: 55 name: "conv2_1_V" num_output: 49 name: "conv2_1_H" num_output: 49 name: "conv2_1_P" num_output: 117 name: "conv2_2_V" num_output: 62 name: "conv2_2_H" num_output: 62 name: "conv2_2_P" num_output: 119 name: "conv3_1_V" num_output: 110 name: "conv3_1_H" num_output: 110 name: "conv3_1_P" num_output: 230 name: "conv3_2_V" num_output: 118 name: "conv3_2_H" num_output: 118 name: "conv3_2_P" num_output: 223 name: "conv3_3_V" num_output: 141 name: "conv3_3_H" num_output: 141 name: "conv3_3_P" num_output: 256 name: "conv4_1_V" num_output: 233 name: "conv4_1_H" num_output: 233 name: "conv4_1_P" num_output: 489 name: "conv4_2_V" num_output: 256 name: "conv4_2_H" num_output: 256 name: "conv4_2_P" num_output: 470 name: "conv4_3_V" num_output: 302 name: "conv4_3_H" num_output: 302 name: "conv4_3_P" num_output: 512 name: "conv5_1_V" num_output: 398 name: "conv5_1_H" num_output: 398 name: "conv5_1_P" num_output: 512 name: "conv5_2_V" num_output: 390 name: "conv5_2_H" num_output: 390 name: "conv5_2_P" num_output: 512 name: "conv5_3_V" num_output: 379 name: "conv5_3_H" num_output: 379 name: "conv5_3_P" num_output: 512

cicido · Dec 15 '17

Should the conv*_V and conv*_H layers have the same num_output?
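Comparing with the shape-mismatch error above: an H layer with 398 outputs has 398 × 398 × 1 × 3 = 475212 weights (the source/caffemodel shape), while one with 512 outputs has 512 × 398 × 1 × 3 = 611328 (the target/prototxt shape). A quick sketch of the arithmetic:

```python
# Shape-mismatch arithmetic from the error above (illustrative only):
# a Caffe conv weight blob has shape (num_output, channels, kH, kW).
def blob_size(num_output, channels, kh, kw):
    return num_output * channels * kh * kw

d, n = 398, 512  # decomposition rank vs. original channel count for conv5_1

print(blob_size(d, d, 1, 3))  # 475212 -> conv5_1_H as saved in the caffemodel (398 outputs)
print(blob_size(n, d, 1, 3))  # 611328 -> conv5_1_H as declared in the prototxt (512 outputs)
```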

cicido · Dec 15 '17

@cicido @yihui-he Hi, after running `./combine.sh | xargs ./calflop.sh` I get temp/cb_3c_3C4x_mem_bn_vgg.prototxt and cb_3c_vgg.caffemodel, so I run `./caffe/build/tools/caffe train -solver temp/solver.prototxt -weights temp/cb_3c_vgg.caffemodel -gpu $0`, but I meet the same problem: Cannot copy param 0 weights from layer 'conv5_1_H'; shape mismatch. Source param shape is 398 398 1 3 (475212); target param shape is 512 398 1 3 (611328). Have you fixed it?

zhaobaozi · Jan 04 '18

I downloaded my code and ran it again; however, I could not reproduce these problems.

ethanhe42 · Jan 04 '18

@cicido @yihui-he @zhaobaozi Would you mind sharing your ImageNet data, e.g. examples/imagenet/ilsvrc12_val_lmdb? I'm really interested in this good work, but I don't have the ImageNet dataset. Thanks.

Ai-is-light · Mar 20 '18

@yihui-he Hi, I would like to know whether the zeroed weights are updated during finetuning or still stay at zero. Hope to get your reply.

jiaqun123 · May 14 '18

@Ai-is-light I can give it to you.

JingliangGao · Jun 15 '18

I ran the command `./combine.sh | xargs ./calflop.sh` but hit this problem:

```
F0615 16:06:04.220142 25028 data_transformer.cpp:168] Check failed: height <= datum_height (224 vs. 107)
```

Please tell me how to solve it.
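(The check says the crop_size of 224 is larger than the stored image height of 107, so the images in my LMDB seem to be smaller than the crop. A quick way to confirm, assuming the lmdb Python package and Caffe's compiled protos are available:)

```python
# Inspect the first datum in the LMDB to confirm the stored image size;
# the path is the one from the prototxt above -- adjust as needed.
import lmdb
from caffe.proto import caffe_pb2

env = lmdb.open('caffe/examples/imagenet/ilsvrc12_val_lmdb', readonly=True)
with env.begin() as txn:
    _, value = next(txn.cursor().iternext())

datum = caffe_pb2.Datum()
datum.ParseFromString(value)
print(datum.height, datum.width)  # crop_size (224) must not exceed these
```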

JingliangGao · Jun 15 '18

@yihui-he

JingliangGao · Jun 15 '18

@cicido @fgxfxpfzzfcc @zhaobaozi @Ai-is-light

JingliangGao · Jun 15 '18

I am running `python3 train.py -action c3 -caffe [GPU1]` and encountered this problem: `*** Check failure stack trace: ***` How did you solve it? Thank you very much. @cicido

sunny5555 · Aug 01 '18