caffe-tensorflow
Copying weights to another model is not giving correct results
Hi,
First of all, my apologies if this issue already exists; I could not find the answer I was looking for.
I have converted the caffemodel shown here, specifically these weights and model, to a .npy file. I want to load these weights into a CNN-M-2048 network in TensorFlow. However, I cannot get it to correctly classify inputs that the original Caffe model classifies without problems.
I am trying to load the weights both in Keras with the Theano backend (just to test with another framework) and in TensorFlow. I can open the file and read each layer's weights and biases without any problem. The procedure is the following:
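For reference, a minimal sketch of how I load the converted file (the filename is a placeholder; caffe-tensorflow writes a pickled dict mapping each layer name to its parameters):
import numpy as np

# The .npy file holds a dict: {layer_name: {'weights': ndarray, 'biases': ndarray}}
# encoding='latin1' is needed on Python 3 to read files pickled under Python 2
data = np.load('cnn_m_2048.npy', encoding='latin1').item()
for key in data:
    print(key, data[key]['weights'].shape, data[key]['biases'].shape)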
Copying weights in Keras
# Create the network (Keras 1 API; LRN2D is a custom layer, not part of Keras core)
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D

model = Sequential()
model.add(Convolution2D(96, 7, 7, subsample=(2,2), bias=True, border_mode='valid', activation='relu', name='conv1', input_shape=(20,224,224)))
model.add(LRN2D(n=5, alpha=0.00010000000475, beta=0.75, k=2))
model.add(MaxPooling2D((3,3), strides=(2,2), border_mode='valid'))
model.add(ZeroPadding2D(padding=(1,1)))
model.add(Convolution2D(256, 5, 5, subsample=(2,2), border_mode='same', bias=True, activation='relu', name='conv2'))
model.add(LRN2D(n=5, alpha=0.00010000000475, beta=0.75, k=2))
model.add(MaxPooling2D((3,3), strides=(2,2)))
...
# Copy the weights (Keras's set_weights expects [weights, biases] per layer)
for key in data.keys():
    w, b = data[key]['weights'], data[key]['biases']
    model.get_layer(name=key).set_weights([w, b])
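One thing worth double-checking here (this is an assumption on my side, not something I have confirmed): caffe-tensorflow transposes convolution kernels from Caffe's (out, in, h, w) layout to TensorFlow's (h, w, in, out) layout when writing the .npy file, whereas Keras 1 with the Theano backend expects the Caffe-style (out, in, h, w) ordering. A sketch of the transpose that would be needed in that case:
# Hypothetical fix-up for the Theano backend: transpose conv kernels
# from the TF layout (h, w, in, out) back to the Theano layout (out, in, h, w)
for key in data.keys():
    w, b = data[key]['weights'], data[key]['biases']
    if w.ndim == 4:  # only convolutional kernels; fully connected weights are 2-D
        w = w.transpose(3, 2, 0, 1)
    model.get_layer(name=key).set_weights([w, b])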
Copying weights in Tensorflow
# Create the variables and copy the weights
import tensorflow as tf

conv1_weights = tf.get_variable('conv1_weights', initializer=data['conv1']['weights'])
conv1_biases = tf.get_variable('conv1_biases', initializer=data['conv1']['biases'])
conv2_weights = tf.get_variable('conv2_weights', initializer=data['conv2']['weights'])
conv2_biases = tf.get_variable('conv2_biases', initializer=data['conv2']['biases'])
...
# Create the network (x is assumed to be in NHWC order)
conv_kernel_1 = tf.nn.conv2d(x, conv1_weights, strides=[1,2,2,1], padding='VALID', name='conv1')
bias_layer_1 = tf.nn.relu(tf.nn.bias_add(conv_kernel_1, conv1_biases))
# NOTE: tf.nn.lrn's defaults (depth_radius=5, bias=1, alpha=1, beta=0.5) differ from Caffe's LRN parameters
normalized_layer_1 = tf.nn.lrn(bias_layer_1, name='norm1')
pooled_layer1 = tf.nn.max_pool(normalized_layer_1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID', name='pool1')
padded_layer_2 = tf.pad(pooled_layer1, [[0,0], [1,1], [1,1], [0,0]], "CONSTANT")
conv_kernel_2 = tf.nn.conv2d(padded_layer_2, conv2_weights, strides=[1,2,2,1], padding='SAME', name='conv2')
bias_layer_2 = tf.nn.relu(tf.nn.bias_add(conv_kernel_2, conv2_biases))
normalized_layer_2 = tf.nn.lrn(bias_layer_2, name='norm2')
pooled_layer2 = tf.nn.max_pool(normalized_layer_2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID', name='pool2')
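Also worth noting (again an assumption): since caffe-tensorflow writes kernels in the (h, w, in, out) layout, tf.nn.conv2d here expects NHWC inputs, so if my input batch comes from the Caffe pipeline in NCHW order it would have to be transposed first:
# Hypothetical: convert a Caffe-style NCHW batch into the NHWC order used above
x_nchw = tf.placeholder(tf.float32, shape=[None, 20, 224, 224], name='input_nchw')
x = tf.transpose(x_nchw, [0, 2, 3, 1])  # NCHW -> NHWC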
Is my procedure incorrect, or do the weights and biases need some kind of transformation after being converted with the caffe-tensorflow tool? Thanks in advance.
Same problem here... when I converted one model from Keras to TensorFlow it worked, but for two other models (one Keras, the other MXNet) it fails...
Hi,
My problem was the Local Response Normalization layer. I took TensorFlow's implementation and added it as a Lambda layer, and everything worked correctly.
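For anyone hitting the same thing, a minimal sketch of that fix, assuming the TensorFlow backend with channels-last data (tf.nn.lrn normalizes over the last axis). The parameter mapping is the subtle part: Caffe divides alpha by the window size n, while tf.nn.lrn does not, and TF's depth_radius is the half-window (n-1)/2:
import tensorflow as tf
from keras.layers import Lambda

# Caffe LRN params: local_size n=5, alpha=0.0001, beta=0.75, k=2
# TF equivalents:   depth_radius=(n-1)/2=2, alpha=0.0001/n, beta=0.75, bias=k=2
def caffe_lrn(t):
    return tf.nn.lrn(t, depth_radius=2, bias=2.0, alpha=0.0001 / 5, beta=0.75)

model.add(Lambda(caffe_lrn))  # in place of the custom LRN2D layer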
Same problem here.