
Copying weights to another model is not giving correct results

Open · AdrianNunez opened this issue on Jan 11 '18 · 3 comments

Hi,

First of all, my apologies if this issue already exists, I could not find the answer I was looking for.

I have converted the caffemodel shown here, specifically these weights and this model, to a .npy file. I want to load these weights into a CNN-M-2048 network in TensorFlow. However, I cannot get the network to correctly classify the original inputs, which Caffe classifies fine.

I am trying to load the weights both in Keras with the Theano backend (just to test with another framework) and in TensorFlow. I can open the file and see each layer's weights and biases without any problem.
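
For reference, this is roughly how I load and inspect the dump (the file name is just an example; the converter saves a dict of per-layer dicts, so loading needs .item() and, on Python 3, a latin1 encoding):

# Inspect the converted .npy dump (file name is illustrative)
import numpy as np

data = np.load('cnn_m_2048.npy', allow_pickle=True, encoding='latin1').item()
for name, params in data.items():
    # each entry should contain 'weights' and, usually, 'biases'
    print(name, {k: v.shape for k, v in params.items()})

The procedure for copying the weights is the following: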

Copying weights in Keras

# Create the network (Keras 1-style API; LRN2D is a local response normalization layer)
model = Sequential()
        
model.add(Convolution2D(96, 7, 7, subsample=(2,2), bias=True, border_mode='valid', activation='relu', name='conv1', input_shape=(20,224,224)))   
model.add(LRN2D(n=5, alpha=0.00010000000475, beta=0.75, k=2))
model.add(MaxPooling2D((3,3), strides=(2,2), border_mode='valid'))
    
model.add(ZeroPadding2D(padding=(1,1)))
model.add(Convolution2D(256, 5, 5, subsample=(2,2), border_mode='same', bias=True, activation='relu', name='conv2'))
model.add(LRN2D(n=5, alpha=0.00010000000475, beta=0.75, k=2))
model.add(MaxPooling2D((3,3), strides=(2,2)))
...

# Copy the weights (data is the dict loaded from the converted .npy file)
for key in data.keys():
    w, b = data[key]['weights'], data[key]['biases']
    model.get_layer(name=key).set_weights([w, b])
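
One thing worth double-checking is the axis order of the convolution kernels: as far as I know, caffe-tensorflow dumps them already transposed to (height, width, in_channels, out_channels), whereas Caffe itself stores (out_channels, in_channels, height, width). If a dump were still in Caffe order, a transpose along these lines would be needed before calling set_weights (just a sketch):

# Sketch: move Caffe-ordered kernels (out, in, h, w) to the (h, w, in, out)
# layout expected by TensorFlow-style convolutions. Only needed if the dump
# was not already transposed by the converter.
def caffe_to_tf_kernel(w):
    if w.ndim == 4:
        return w.transpose(2, 3, 1, 0)
    return w  # fully connected weights need separate handling

for key in data.keys():
    w = data[key]['weights']
    print(key, w.shape, '->', caffe_to_tf_kernel(w).shape)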

Copying weights in TensorFlow

# Create the variables and copy the weights
conv1_weights = tf.get_variable('conv1_weights', initializer=data['conv1']['weights'])
conv1_biases = tf.get_variable('conv1_biases', initializer=data['conv1']['biases'])
conv2_weights = tf.get_variable('conv2_weights', initializer=data['conv2']['weights'])
conv2_biases = tf.get_variable('conv2_biases', initializer=data['conv2']['biases'])
...

# Create the network
conv_kernel_1 = tf.nn.conv2d(x, conv1_weights, strides=[1,2,2,1], padding='VALID', name='conv1')
bias_layer_1 = tf.nn.relu(tf.nn.bias_add(conv_kernel_1, conv1_biases))
normalized_layer_1 = lrn(bias_layer_1, 'norm1')
pooled_layer1 = tf.nn.max_pool(normalized_layer_1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID', name='pool1')

padded_layer_2 = tf.pad(pooled_layer1, [[0,0], [1,1], [1,1], [0,0]], "CONSTANT")
conv_kernel_2 = tf.nn.conv2d(padded_layer_2, conv2_weights, strides=[1,2,2,1], padding='SAME', name='conv2')
bias_layer_2 = tf.nn.relu(tf.nn.bias_add(conv_kernel_2, conv2_biases))
normalized_layer_2 = lrn(bias_layer_2, name='norm2')
pooled_layer2 = tf.nn.max_pool(normalized_layer_2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID', name='pool2')
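
To check the classification, the forward pass is the usual one (sketched below; logits stands in for the output of the final fully connected layer, which is elided above, and batch is a preprocessed input batch):

# Sketch of the forward pass used to check predictions (TF1-style session)
probs = tf.nn.softmax(logits)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    predictions = sess.run(probs, feed_dict={x: batch})
    print(predictions.argmax(axis=1))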

Is my procedure incorrect, or do the weights and biases need some kind of transformation after being converted with the caffe-tensorflow tool? Thanks in advance.

AdrianNunez · Jan 11 '18

Same problem here. When I converted one model from Keras to TensorFlow it worked, but with two other models (one Keras, the other MXNet) it fails.

pkfancy · Apr 9 '18

Hi,

My problem was the Local Response Normalization layer: I took the one from TensorFlow, added it as a Lambda layer, and everything worked correctly.
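
A minimal sketch of that kind of wrapper (assuming the TensorFlow backend and channels_last tensors, since tf.nn.lrn normalizes over the last axis; the parameters follow the Caffe layer: local_size=5 maps to depth_radius=2, alpha is divided by local_size, and k becomes bias):

from keras.layers import Lambda
import tensorflow as tf

def caffe_style_lrn(x):
    # tf.nn.lrn sums squares over the last (channel) axis
    return tf.nn.lrn(x, depth_radius=2, bias=2.0, alpha=0.0001 / 5, beta=0.75)

model.add(Lambda(caffe_style_lrn, name='norm1'))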

AdrianNunez · Apr 12 '18

Same problem here.

joriatyBen · Nov 28 '18