DeepLearning
DeepLearning copied to clipboard
About fine-tuning in the source code of Stacked Denoising Autoencoders
I'm using the Java version of the code. Why the current functionality of the finetune
function is used for finetuning of the network? The finetune
function does not update the weights of the entire network (except for the last layer, i.e. log_layer
).
The other versions of the code written in different languages are using the same trick. Anyone can help me understand why this kind of fine-tuning works? Any references?
Thanks.