
Documentation issue: channel pruning (GPU) & self-defined models


After taking a glance at the GPU version of channel pruning, I found that it may not strictly implement lasso regression: neither the coordinate descent method nor the LARS optimization algorithm is used. It would be great if you could add some documentation describing the algorithm used by the GPU version.

BTW, the documentation for self-defined models is not clear enough:

  • Execution script name: the execution script must be named networkname_at_datasetname_run.py, where the dataset name matches the dataset class; otherwise it will not be identified by utils/get_path_args.py.
  • Checkpoint name: the checkpoint of a self-defined model must be named model-xxxx.ckpt rather than model.ckpt directly. Otherwise, the channel pruning (CPU version) __init__() will evaluate __build_pruned_evaluate_model(), since tf.train.checkpoint_exists(path) recognizes this checkpoint and causes an error (see learner/channel_pruning/learner.py line 284; error message: eval_logits = tf.get_collection('logits')[0] — list index out of range). In addition, the checkpoint file that points to the latest checkpoint is required, since the code uses tf.train.latest_checkpoint() to locate it.
  • Checkpoint variable scope: the original checkpoint must use the variable scope 'model', or the parameters cannot be restored (see learner/channel_pruning/learner.py line 251). A minimal sketch combining these three constraints is shown below.
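
A minimal TF 1.x sketch of saving a self-defined model so it satisfies all three constraints above. The file and directory names (resnet_at_cifar10_run.py, ./models) are hypothetical, and the 'logits' collection is inferred from the error message above rather than from the docs:

```python
# Hypothetical file: resnet_at_cifar10_run.py
# ('resnet' = network name, 'cifar10' = dataset class name, per the convention above)
import tensorflow as tf

with tf.variable_scope('model'):  # required scope; see learner.py line 251
    inputs = tf.placeholder(tf.float32, [None, 32, 32, 3])
    logits = tf.layers.conv2d(inputs, filters=10, kernel_size=3)
tf.add_to_collection('logits', logits)  # read back via tf.get_collection('logits')[0]

saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.gfile.MakeDirs('./models')
    # 'model-0000.ckpt' (not 'model.ckpt') avoids the spurious match in
    # tf.train.checkpoint_exists(); save() also writes the 'checkpoint' index
    # file that tf.train.latest_checkpoint() relies on.
    saver.save(sess, './models/model-0000.ckpt')

print(tf.train.latest_checkpoint('./models'))  # -> ./models/model-0000.ckpt
```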

All in all, PocketFlow is a great project; I have learned a lot from it and am still learning :)

zheLim avatar Nov 27 '18 09:11 zheLim

BTW, the CPU version of channel pruning only handles regular 2D convolutions; it cannot process dilated convolutions. It would be better if a clarification were added to the documentation.
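
For illustration, the distinction in TF 1.x terms (the supported/unsupported split below is as reported here, not verified against the code):

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 56, 56, 64])
y_regular = tf.layers.conv2d(x, 64, 3)                   # regular 2D conv: supported
y_dilated = tf.layers.conv2d(x, 64, 3, dilation_rate=2)  # dilated conv: not supported
```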

zheLim avatar Nov 27 '18 10:11 zheLim

@zheLim Thanks for your suggestions.

  1. The GPU version does not implement lasso regression. Instead, it solves an L2,1-norm regularized optimization problem with proximal gradient descent, gradually increasing the regularization strength to slowly raise the pruning ratio to the target value (see the sketch after this list). We will provide separate documentation describing the algorithm in detail.
  2. We will clarify these details in the "self-defined models" documentation.
  3. This will be clarified in the next PR.
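
A minimal NumPy sketch of the key step, assuming output channels are the groups; the function name prox_l21 and the update rule shown are illustrative, not PocketFlow's actual code:

```python
import numpy as np

def prox_l21(w, lam):
    """Row-wise proximal operator of lam * ||W||_{2,1}.

    w   : (num_channels, weights_per_channel) weight matrix
    lam : threshold (grows over training to raise the pruning ratio)
    """
    norms = np.linalg.norm(w, axis=1, keepdims=True)             # per-group L2 norm
    scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-12), 0.0)
    return w * scale                                             # shrink, or zero out the group

# One proximal gradient step on weights w with loss gradient g and step size lr:
#   w = prox_l21(w - lr * g, lr * lam)
# Groups whose L2 norm drops to zero correspond to pruned channels.
```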

jiaxiang-wu avatar Nov 29 '18 00:11 jiaxiang-wu

Thanks for the reply. Does solving an L2,1-norm regularized optimization problem yield better results than lasso regression?

zheLim avatar Nov 29 '18 03:11 zheLim

It runs faster in the multi-GPU setting and achieves higher accuracy on some models. We will provide more detailed results in the documentation.

jiaxiang-wu avatar Nov 29 '18 03:11 jiaxiang-wu

Thanks a lot :)

zheLim avatar Nov 29 '18 03:11 zheLim

Doc:

  1. add documentation for ChannelPrunedGpuLearner;
  2. fix minor issues in "self-defined models" and ChannelPrunedLearner.

jiaxiang-wu avatar Nov 29 '18 03:11 jiaxiang-wu

Hey @jiaxiang-wu
How about the accuracy of ChannelPrunedGpuLearner on MobileNet? Is this a structured pruning algorithm that leads to a regular sparsity pattern?

GoldenSpark avatar Jan 08 '19 07:01 GoldenSpark

  1. For MobileNet-v1, the top-1 accuracies are: 68.5% (50% FLOPs) | 67.8% (40% FLOPs) | 66.3% (30% FLOPs)
  2. ChannelPrunedGpuLearner is a structured-pruning algorithm. The compressed model has regular sparsity patterns.

jiaxiang-wu avatar Jan 08 '19 07:01 jiaxiang-wu

@jiaxiang-wu Thanks much!

GoldenSpark avatar Jan 08 '19 09:01 GoldenSpark