L-OBS
Some questions about re-implementing L-OBS on Lenet5
Hi,
Thank you very much for sharing your code.
I am trying to re-implement your L-OBS algorithm for the purpose of learning. I have successfully applied it to a fully connected neural network. However, I encountered some problems when applying it to a CNN (LeNet-5 here), so I tried to find some answers in your code.
I found that the LeNet-5 model in your code (in the dev branch) is different from the conventional one. I would like to ask: is this the model used for pruning in the paper, or is it just an example? And if it is just an example, could you please give me some help with implementing the L-OBS algorithm on LeNet-5 (mainly the feature-map special-combination problem)?
I hope to get your help and look forward to hearing from you soon.
Best wishes, Hui
Hi @Hui-Ouyang16 ,
I don't remember the exact architecture of LeNet-5 in the paper, but I think you can change the LeNet-5 architecture without affecting the functionality.
How can I help you with the feature-map special-combination problem?
Best regards, Shangyu
I'm very pleasantly surprised that you responded so quickly.
In the second convolutional layer of the original LeNet-5 model, the in-channel count is 6 and the out-channel count is 16. It gets the 16 output channels through a special combination of the 6 input channels. My question is: in your experiments, how did you deal with this combination problem when pruning? Could I get some details on your implementation? If you could provide the experimental code from the paper as a reference, it would be greatly appreciated.
Thank you very much for your quick reply.
Best wishes, Hui
Hi @Hui-Ouyang16 ,
Is the combination you mention just how the CNN works? If you are referring to the image-patch extraction and its operation with the kernel, you may check https://github.com/csyhhu/L-OBS/blob/dev/PyTorch/utils/hessian_utils.py#L70 for implementation details.
To summarize, I use a function in TensorFlow to extract image patches (the same shape as the kernel), which are used to calculate the Hessian.
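Roughly, the idea looks like this (a simplified NumPy sketch of my own, not the repo's exact code; all shapes and names here are hypothetical, matching LeNet-5's C3 layer sizes):

```python
import numpy as np

# Hypothetical shapes: one 6-channel 12x12 input, 5x5 kernels, 16 output
# channels (as in LeNet-5's C3 layer). Not the repo's actual code.
rng = np.random.default_rng(0)
x = rng.standard_normal((6, 12, 12))
weight = rng.standard_normal((16, 6, 5, 5))

k, H, W = 5, 12, 12
out_h, out_w = H - k + 1, W - k + 1  # 8 x 8 output positions

# Extract one kernel-shaped patch per output position: (64, 6*5*5) = (64, 150).
patches = np.stack([
    x[:, i:i + k, j:j + k].ravel()
    for i in range(out_h) for j in range(out_w)
])

# Convolution becomes a matrix multiply: each patch acts like one "input
# sample" of an equivalent fully-connected layer.
out = patches @ weight.reshape(16, -1).T          # (64, 16)
out = out.T.reshape(16, out_h, out_w)

# The layer Hessian is then accumulated over these patch vectors,
# roughly (1/N) * patches.T @ patches.
hessian = patches.T @ patches / patches.shape[0]  # (150, 150)
print(hessian.shape)
```

Once the layer is expressed this way, the fully-connected L-OBS machinery you already implemented applies to the convolutional layer as well.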
Best regards, Shangyu
Hi @csyhhu ,
Thanks for your reply, but I still have some confusion. To specify my question: I found the original LeNet-5 model in the paper "Gradient-Based Learning Applied to Document Recognition", as shown in the following figure.
In this model, consider the convolution layer from S2 to C3. The author uses the following combination to get 16 feature maps from 6 feature maps. What I want to know is how to deal with these combinations.
Best wishes, Hui
Hi @Hui-Ouyang16 ,
Thanks for your detailed explanation. It seems that we don't use such a combination in our implementation, nor do TensorFlow/PyTorch use it in their frameworks.
In current TensorFlow/PyTorch, all feature maps of the input are used in the convolution with every kernel. Since each kernel has the same number of channels as the input, it is not necessary to perform such a combination (or "choice", if I understand correctly) of feature maps.
So there is indeed a difference between the traditional LeNet-5 and current implementations of CNNs.
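To illustrate (a small NumPy sketch of my own, not code from the repo): the C3 connectivity scheme can be viewed as a binary mask applied to the dense kernel that a modern framework would use.

```python
import numpy as np

# Modern frameworks connect every output channel of a conv layer to ALL
# input channels, so a 16x6x5x5 kernel is dense along the input-channel
# axis. The original C3 scheme of LeNet-5 can be emulated by masking that
# dense kernel with a 16x6 binary connection table.
table = np.zeros((16, 6), dtype=bool)
# First six rows of the C3 table from LeCun et al. (1998): output map k
# connects to the 3 consecutive input maps {k, k+1, k+2} (mod 6).
for out_ch in range(6):
    table[out_ch, [(out_ch + i) % 6 for i in range(3)]] = True
# ... remaining 10 rows of the table omitted for brevity.

dense_kernel = np.random.standard_normal((16, 6, 5, 5))
sparse_kernel = dense_kernel * table[:, :, None, None]

# Dense: every output map sees all 6 input maps.
# Masked: output map 0 only sees input maps {0, 1, 2}.
print(int(table[0].sum()))
print(np.count_nonzero(sparse_kernel[0, 3:]))
```

So from the pruning point of view, the dense kernel already subsumes the original scheme; the framework simply learns which input maps matter rather than hard-wiring the table.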
Best regards, Shangyu
Hi @csyhhu ,
Thank you so much for your patience. This cleared up my confusion.
Best wishes, Hui