How to insert Spatial Transformer layer BEFORE a convnet
I would like to have a Spatial Transformer layer before a pretrained convnet, such as the Keras ResNet50. To that end I have made the various attempts below to connect the SpatialTransformer's output to the ResNet50's input, but I get errors every time. Furthermore, I'm not entirely sure how these layers are supposed to work with the Keras Functional API.
from seya.layers.attention import SpatialTransformer
import numpy as np
from keras.layers import MaxPooling2D, Convolution2D, Dense, Activation, Flatten, Input, GlobalAveragePooling2D
from keras.applications.resnet50 import ResNet50
from keras.models import Model
from keras.optimizers import Adam
def locnet(input):
    # initial weights
    b = np.zeros((2, 3), dtype='float32')
    b[0, 0] = 1
    b[1, 1] = 1
    W = np.zeros((50, 6), dtype='float32')
    weights = [W, b.flatten()]
    # original from https://github.com/EderSantana/seya/blob/master/examples/Spatial%20Transformer%20Networks.ipynb
    # locnet = Sequential()
    # locnet.add(MaxPooling2D(pool_size=(2, 2), input_shape=input_shape))
    # locnet.add(Convolution2D(20, 5, 5))
    # locnet.add(MaxPooling2D(pool_size=(2, 2)))
    # locnet.add(Convolution2D(20, 5, 5))
    #
    # locnet.add(Flatten())
    # locnet.add(Dense(50))
    # locnet.add(Activation('relu'))
    # locnet.add(Dense(6, weights=weights))
    # # locnet.add(Activation('sigmoid'))
    # translated the above to functional API
    pool1 = MaxPooling2D()(input)
    conv1 = Convolution2D(20, 5, 5)(pool1)
    pool2 = MaxPooling2D()(conv1)
    conv2 = Convolution2D(20, 5, 5)(pool2)
    flatten = Flatten()(conv2)
    dense = Dense(50)(flatten)
    # dense = Activation('relu')(dense)
    params = Dense(6, weights=weights, name='affine_params')(dense)
    return params
def spatial_transformer_net(input_shape, num_categories):
    '''
    plug a spatial transformer network into a Keras resnet50
    '''
    # make an input tensor
    i = Input(input_shape)
    # get a locnet
    loc = locnet(i)
    # get a spatial transformer
    st = SpatialTransformer(localization_net=loc, downsample_factor=2, input_shape=input_shape)
    # get a pretrained convnet
    ####################################################### can the SpatialTransformer be plugged in as the input tensor?
    base_model = ResNet50(weights='imagenet', include_top=False)(st)
    # freeze it
    for layer in base_model.layers:
        layer.trainable = False
    ################################################################################## or do we plug it in here somehow?
    # base_model.input = st.output
    # base_model.input = st
    # set output
    Z = base_model.get_layer('activation_49').output
    Z = GlobalAveragePooling2D()(Z)
    Z = Dense(1024, activation='relu')(Z)
    Z = Dense(num_categories, activation='softmax')(Z)
    # create the Keras functional model
    model = Model(input=i, output=Z)
    # compile the model
    model.compile(optimizer=Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0), loss='categorical_crossentropy', metrics=['accuracy'])
    return model
I have it working, but it is still not tested (I don't know if the output makes sense). Here is my code:
data = Input(shape=(3,227,227), dtype='float32', name='_input')
b = np.zeros((2, 3), dtype='float32')
b[0, 0] = 1
b[1, 1] = 1
W = np.zeros((50, 6), dtype='float32')
weights = [W, b.flatten()]
locnet = Flatten()(data)
locnet = Dense(50)(locnet)
locnet = Activation('relu')(locnet)
locnet = Dense(6, weights=weights)(locnet)
locnet_model = Model(input=data, output=locnet)
x = SpatialTransformer(localization_net=locnet_model, downsample_factor=1, return_theta=False)(data)
The locnet is just a dummy network.
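As a side note on the weight initialization above (this just restates what the code already does, nothing new): with W set to zero and the bias set to [1, 0, 0, 0, 1, 0], the final Dense(6) layer initially outputs the parameters of an identity affine transform, so the SpatialTransformer starts by passing its input through unchanged and only learns to deviate from the identity during training. A quick check:

import numpy as np

# identity affine transform: theta = [[1, 0, 0], [0, 1, 0]]
b = np.zeros((2, 3), dtype='float32')
b[0, 0] = 1
b[1, 1] = 1
W = np.zeros((50, 6), dtype='float32')  # zero weights -> the layer's output is just the bias at init

print(b.flatten())  # [ 1.  0.  0.  0.  1.  0.]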
@AdrianNunez OK fair enough. But how would you go about feeding that x into the ResNet50's input?
Something like this?
from keras.applications.resnet50 import ResNet50
net_output = ResNet50(weights='imagenet', include_top=False)(x)
But this construction gives me an error:
x = SpatialTransformer(localization_net=locnet, downsample_factor=1, return_theta=False)(data)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 487, in __call__
self.build(input_shapes[0])
File "/usr/local/lib/python2.7/dist-packages/seya/layers/attention.py", line 43, in build
self.locnet.build(input_shape)
AttributeError: 'TensorVariable' object has no attribute 'build'
The function ResNet50 returns a model. You first have to get the model by calling the function, and then use that model with your own data.
data = Input(...)
...
resnet50 = ResNet50(weights='imagenet', include_top=False, input_tensor=data)(x)
net_output = resnet50.output
x = Flatten()(net_output)
...
or
data = Input(...)
...
resnet50 = ResNet50(weights='imagenet', include_top=False)(x)
net_output = resnet50(data)
x = Flatten()(net_output)
...
In the first case I use the input_tensor parameter to specify the input of the network and then I take the actual output of the network. In the second case I create the model and then I use it with my own data.
Using your first construction like so (with the ... denoting the locnet definition):
def my_net(input_shape, num_categories):
    data = Input(shape=input_shape, dtype='float32', name='_input')
    ...
    x = SpatialTransformer(localization_net=locnet, downsample_factor=1, return_theta=False)(data)
    r50 = ResNet50(weights='imagenet', include_top=False, input_tensor=data)(x)
    net_output = r50.output
    Z = Flatten()(net_output)
    Z = Dense(num_categories, activation='softmax')(Z)
    model = Model(input=data, output=Z)
    model.compile(optimizer=Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0), loss='categorical_crossentropy')
    return model
And using the second construction like so:
def my_net(input_shape, num_categories):
    data = Input(shape=input_shape, dtype='float32', name='_input')
    ...
    x = SpatialTransformer(localization_net=locnet, downsample_factor=1, return_theta=False)(data)
    r50 = ResNet50(weights='imagenet', include_top=False)(x)
    net_output = r50(data)
    Z = Flatten()(net_output)
    Z = Dense(num_categories, activation='softmax')(Z)
    model = Model(input=data, output=Z)
    model.compile(optimizer=Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0), loss='categorical_crossentropy')
    return model
I get the error:
File "/home/qwerty/neural_nets/spatial_transformer.py", line 97, in my_net
x = SpatialTransformer(localization_net=locnet, downsample_factor=1, return_theta=False)(data)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 487, in __call__
self.build(input_shapes[0])
File "/usr/local/lib/python2.7/dist-packages/seya/layers/attention.py", line 43, in build
self.locnet.build(input_shape)
AttributeError: 'TensorVariable' object has no attribute 'build'
My seya comes from the keras1 branch, and I use Keras 1.1.0.
The locnet that you pass to the SpatialTransformer layer should be a model, not a tensor. That means it should not end in something like this:
locnet = Dense(6, weights=weights)(locnet)
x = SpatialTransformer(localization_net=locnet, downsample_factor=1, return_theta=False)(data)
But rather:
locnet = Dense(6, weights=weights)(locnet)
locnet_model = Model(input=data, output=locnet)
x = SpatialTransformer(localization_net=locnet_model, downsample_factor=1, return_theta=False)(data)
...
The "AttributeError: 'TensorVariable' object has no attribute 'build'" error refers to this.
Apart from this, and this was my copy-pasting mistake, this line is not correct:
r50 = ResNet50(weights='imagenet', include_top=False)(x)
It should be:
r50 = ResNet50(weights='imagenet', include_top=False)
That is, you don't include the input at the end between parentheses. In fact, the input is then given in the following line:
net_output = r50(data)
My apologies for this error.
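Putting the two corrections together, here is a minimal, untested sketch of how I understand the whole thing should be wired up. It assumes Keras 1.x with the Theano dim ordering used above, an input_shape that matches what ResNet50 expects (e.g. (3, 224, 224)), and that, since the goal is to put the transformer BEFORE the convnet, the ResNet50 model is called on the transformer output x rather than on the raw data:

from seya.layers.attention import SpatialTransformer
import numpy as np
from keras.layers import Dense, Activation, Flatten, Input
from keras.applications.resnet50 import ResNet50
from keras.models import Model
from keras.optimizers import Adam

def my_net(input_shape, num_categories):
    data = Input(shape=input_shape, dtype='float32', name='_input')
    # dummy locnet, initialised to the identity affine transform
    b = np.zeros((2, 3), dtype='float32')
    b[0, 0] = 1
    b[1, 1] = 1
    W = np.zeros((50, 6), dtype='float32')
    weights = [W, b.flatten()]
    locnet = Flatten()(data)
    locnet = Dense(50)(locnet)
    locnet = Activation('relu')(locnet)
    locnet = Dense(6, weights=weights)(locnet)
    # the localization_net must be a Model, not a tensor
    locnet_model = Model(input=data, output=locnet)
    # spatial transformer applied to the input image
    x = SpatialTransformer(localization_net=locnet_model, downsample_factor=1, return_theta=False)(data)
    # create the pretrained model first, then call it on the transformed tensor
    r50 = ResNet50(weights='imagenet', include_top=False)
    net_output = r50(x)
    Z = Flatten()(net_output)
    Z = Dense(num_categories, activation='softmax')(Z)
    model = Model(input=data, output=Z)
    model.compile(optimizer=Adam(lr=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
    return model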
Is it resolved?