
Layer weights change despite trainable flag set to false


I have a network where some of the layers have the trainable flag set to False. I check the weight values before and after calling fit, and they change. They change even when regularization is disabled. Could this be a bug, or am I missing something? Thanks

Jeff

for layerName in ae.layers_:
    if isinstance(ae.layers_[layerName], Conv2DLayerFast):
        layer = ae.layers_[layerName]
        # re-register W and b with trainable=False
        layer.W = layer.add_param(layer.W, layer.W.container.data.shape, trainable=False)
        layer.b = layer.add_param(layer.b, layer.b.container.data.shape, trainable=False)
        # alternative I also tried: strip the 'trainable' tag directly
        #layer.params.items()[0][1].remove('trainable')
        #layer.params.items()[1][1].remove('trainable')


print get_all_param_values(ae.layers_['conv2d1'])  # weights before fit

ae.fit(X, X_out)
print get_all_param_values(ae.layers_['conv2d1'])  # weights after fit -- these differ

caleytown avatar Jan 21 '16 22:01 caleytown

Also, the message that is displayed when training the network

"Neural Network with 458656 learnable parameters"

shows the total number of learnable parameters, not the number of parameters that are actually being learned (it prints the same number even when trainable=False is set on some layers).

Here is a fix, though I'm not sure whether the current behavior is intended or a bug:

    @staticmethod
    def _get_greeting(nn):
        # count only the parameters tagged trainable=True
        shapes = [param.get_value().shape for param in
                  nn.get_all_params(trainable=True) if param]
        nparams = reduce(operator.add, [reduce(operator.mul, shape) for
                                        shape in shapes])
        message = ("# Neural Network with {} learnable parameters"
                   "\n".format(nparams))
        return message
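
For comparison, here's a quick way to see the difference between the two counts at the user level (a rough sketch; I'm assuming `ae.get_all_params()` with no tags returns every parameter, as Lasagne's `get_all_params` does):

import numpy as np

def count_params(params):
    # total number of scalar weights across all parameter tensors
    return sum(int(np.prod(p.get_value().shape)) for p in params)

print count_params(ae.get_all_params())                # every parameter
print count_params(ae.get_all_params(trainable=True))  # only the ones being updated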

caleytown avatar Jan 21 '16 23:01 caleytown

About your first problem: I don't know why the weights change; nolearn should not interfere with that. Have you tried using the same layers but training without nolearn, to see what happens?
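
For example, something like this pure-Lasagne training loop should show whether the frozen weights still move (just a sketch with placeholder shapes and layers, not your actual network):

import numpy as np
import theano
import theano.tensor as T
import lasagne

X_var = T.matrix('X')
y_var = T.matrix('y')

l_in = lasagne.layers.InputLayer((None, 10), input_var=X_var)
l_frozen = lasagne.layers.DenseLayer(l_in, num_units=10)
l_out = lasagne.layers.DenseLayer(l_frozen, num_units=10)

# drop the 'trainable' tag from the first dense layer's params
l_frozen.params[l_frozen.W].remove('trainable')
l_frozen.params[l_frozen.b].remove('trainable')

prediction = lasagne.layers.get_output(l_out)
loss = lasagne.objectives.squared_error(prediction, y_var).mean()

# only params still tagged 'trainable' receive updates
params = lasagne.layers.get_all_params(l_out, trainable=True)
updates = lasagne.updates.sgd(loss, params, learning_rate=0.01)
train_fn = theano.function([X_var, y_var], loss, updates=updates)

X = np.random.rand(32, 10).astype(theano.config.floatX)
W_before = l_frozen.W.get_value().copy()
train_fn(X, X)
print np.allclose(W_before, l_frozen.W.get_value())  # True if freezing works

If the weights stay put here but change under nolearn, that would point at nolearn.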

Regarding the second problem, I guess it depends on your definition of "learnable". I believe the main distinction is "learnable" vs hyper-parameters such as the learning rate. Maybe there should be a second sentence with the number of trainable parameters, as you suggested? You could try to make a pull request and see what @dnouri thinks about it.
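
E.g. the greeting could report both counts; just a sketch of possible wording, where n_total and n_trainable are placeholders for the two counts:

message = ("# Neural Network with {} learnable parameters "
           "({} of them trainable)\n".format(n_total, n_trainable))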

BenjaminBossan avatar Jan 23 '16 15:01 BenjaminBossan

I updated the 'greeting' in the way that @caleytown suggested. I think it's what people expect.

dnouri avatar Mar 11 '16 05:03 dnouri

@caleytown If you've already initialized your network, e.g. you've trained it with fit before running your loop, then make sure that you initialize the network again (so that the updated list of parameters is passed this time).

Something like this should do:

# after some training, set some params to be not trainable:
mylayer = ae.layers_['mylayer']
mylayer.params[mylayer.W].remove('trainable')
mylayer.params[mylayer.b].remove('trainable')

# now call initialize to recompile the optimizer, then fit:
ae._initialized = False
ae.initialize()

ae.fit(X, X_out)
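
To double-check that the recompiled optimizer really leaves those weights alone, you can snapshot them around the fit call (a quick sketch, assuming numpy is imported as np):

W_before = mylayer.W.get_value().copy()
ae.fit(X, X_out)
assert np.array_equal(W_before, mylayer.W.get_value())  # frozen weights untouched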

dnouri avatar Mar 26 '16 02:03 dnouri