caffe-tensorflow
Problems with conversion
Hi, I tried to convert the trained SqueezeNet weights to TensorFlow, but I get this error:
ValueError: Unable to determine kernel parameter!
How do I fix that? Thank you very much!
I'm also seeing this issue when trying to convert some models and haven't figured out the problem, yet.
I think in the case of SqueezeNet (where I also got that error), the problem is in the global average pooling layer here:
https://gist.github.com/bmount/62089f03e998dc945d9fdb76d8cf82cd#file-squeeze_1-1-prototxt-L652
Does TensorFlow have a generic global pool op? Otherwise, for the converter, the pool kernel dimensions could be set from the output of the previous layer, i.e. made just large enough to cover the whole input exactly once.
TensorFlow has a global average pooling op. I'm looking at this issue and hope it can be fixed.
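For reference, a global average pool over an NHWC tensor can be emulated with a plain mean over the spatial axes; this is just a minimal sketch, not the converter's generated code:

```python
import tensorflow as tf

def global_avg_pool(x):
    # Average over the spatial dimensions of an NHWC tensor, i.e. an average
    # pool whose kernel covers the whole input exactly once; output is [N, 1, 1, C].
    # (Older TF 1.x releases spell the keyword keep_dims instead of keepdims.)
    return tf.reduce_mean(x, axis=[1, 2], keepdims=True)
```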
So did anyone succeed in converting SqueezeNet?
@arsakes yeah, you can just change the last pooling parameter from global to a fixed kernel with size equal to the width/height of the previous layer (IIRC it's 14 or 7). I started to make a patch that computes this but got sidetracked; it sounds like TF will support it natively, so maybe it will be fairly straightforward per the note from @tmatas. But the quick fix is to just make that small adjustment to the Caffe model spec.
@bmount Yeah, I did exactly the same thing to convert SqueezeNet.
This is how I converted the SqueezeNet model:
I changed the LayerAdapter class in layers.py to:
```python
class LayerAdapter(object):

    def __init__(self, layer, kind):
        self.layer = layer
        self.kind = kind
        self._input_shape = None

    @property
    def parameters(self):
        name = NodeDispatch.get_handler_name(self.kind)
        name = '_'.join((name, 'param'))
        try:
            return getattr(self.layer, name)
        except AttributeError:
            raise NodeDispatchError('Caffe parameters not found for layer kind: %s' % (self.kind))

    @staticmethod
    def get_kernel_value(scalar, repeated, idx, default=None):
        if scalar:
            return scalar
        if repeated:
            if isinstance(repeated, numbers.Number):
                return repeated
            if len(repeated) == 1:
                # Same value applies to all spatial dimensions
                return int(repeated[0])
            assert idx < len(repeated)
            # Extract the value for the given spatial dimension
            return repeated[idx]
        if default is None:
            raise ValueError('Unable to determine kernel parameter!')
        return default

    def set_input_shape(self, input_shape):
        self._input_shape = input_shape

    @property
    def kernel_parameters(self):
        assert self.kind in (NodeKind.Convolution, NodeKind.Pooling)
        params = self.parameters
        global_pool = hasattr(params, 'global_pooling')
        if params.kernel_size:
            k_h = self.get_kernel_value(params.kernel_h, params.kernel_size, 0)
            k_w = self.get_kernel_value(params.kernel_w, params.kernel_size, 1)
        elif self._input_size:
            k_h, k_w = [self._input_shape.height, self._input_shape.width]
        else:  # errors out in get_kernel_value function
            k_h = self.get_kernel_value(params.kernel_h, params.kernel_size, 0)
            k_w = self.get_kernel_value(params.kernel_w, params.kernel_size, 1)
        s_h = self.get_kernel_value(params.stride_h, params.stride, 0, default=1)
        s_w = self.get_kernel_value(params.stride_w, params.stride, 1, default=1)
        p_h = self.get_kernel_value(params.pad_h, params.pad, 0, default=0)
        p_w = self.get_kernel_value(params.pad_h, params.pad, 1, default=0)
        print self.kind
        print self.layer.name
        print k_h, k_w, s_h, s_w, p_h, p_w
        return KernelParameters(k_h, k_w, s_h, s_w, p_h, p_w)


KernelParameters = namedtuple('KernelParameters', ['kernel_h', 'kernel_w', 'stride_h', 'stride_w',
                                                    'pad_h', 'pad_w'])
```
Note: `_input_shape` has been added to this class to automatically make the kernel for global pooling layers the same size as the input. So, before calling `kernel_parameters` on a `node.layer`, make sure to call `set_input_shape`. For example:

```python
input_shape = node.get_only_parent().output_shape
node.layer.set_input_shape(input_shape)
kernel_params = node.layer.kernel_parameters
```
The only thing I needed to change in the Caffe model spec for SqueezeNet was conv10, where the kernel size was set to 1 and the pad to 1 as well; I removed the pad from that layer since it's not needed.
Could you post the SqueezeNet model for TensorFlow?
Trying to convert SqueezeNet 1.1, I get:
"Multiple top nodes are not supported"
EDIT: OK, this happens only for the train/test version of SqueezeNet 1.1. To use the converter you need to:
- Use the prototxt file with the deployment model definition.
- Convert the last pooling layer into a fixed-size filter (13x13, stride 1), i.e. edit your deployment model definition (one way to make this edit programmatically is sketched below).
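For reference, here is a rough sketch of making that prototxt edit with the Caffe protobuf instead of by hand. It assumes pycaffe's generated module (caffe.proto.caffe_pb2) is importable; the layer name pool10 and the 13x13/stride-1 values are the ones discussed above for the SqueezeNet 1.1 deploy definition, and the file paths are placeholders:

```python
from caffe.proto import caffe_pb2
from google.protobuf import text_format

# Load the deployment prototxt (placeholder path).
net = caffe_pb2.NetParameter()
with open('deploy.prototxt') as f:
    text_format.Merge(f.read(), net)

for layer in net.layer:
    if layer.name == 'pool10':
        # Caffe refuses an explicit kernel size alongside global_pooling,
        # so drop the global flag before setting a fixed window.
        layer.pooling_param.ClearField('global_pooling')
        layer.pooling_param.kernel_size = 13
        layer.pooling_param.stride = 1

with open('deploy_fixed.prototxt', 'w') as f:
    f.write(text_format.MessageToString(net))
```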
@Arsakes I have converted the Caffe prototxt to a deployment file, but converting the last pooling layer into a convolution doesn't work. I get an error:
Check failed: target_blobs.size() == source_layer.blobs_size() (2 vs. 0) Incompatible number of blobs for layer pool10.
Could you share your deployment prototxt?
EDIT: It seems the convolution needs a new name.
Hi @Arsakes,
using your deploy.txt, I ran this command: python convert.py SqueezeNet/SqueezeNet_v1.1/deploy.txt --caffemodel=SqueezeNet/SqueezeNet_v1.1/squeezenet_v1.1.caffemodel --data-output-path=squeeze.npy
and got the following error: F0106 10:49:53.558841 32003 pooling_layer.cpp:19] Check failed: !(pool_param.has_kernel_size() || pool_param.has_kernel_h() || pool_param.has_kernel_w()) With Global_pooling: true Filter size cannot specified *** Check failure stack trace: *** Aborted (core dumped)
(This is Caffe complaining that global_pooling: true and an explicit kernel size cannot both be set on the pooling layer, so the global_pooling line has to be removed when switching to a fixed kernel.)
For anyone still interested in this issue, note that the correct explicit window sizes and strides are:
https://github.com/dividiti/ck-tensorrt/tree/master/package/caffemodel-deepscale-squeezenet-1.0-explicit-window-global-pooling
> kernel_size: 15
> stride: 15
https://github.com/dividiti/ck-tensorrt/tree/master/package/caffemodel-deepscale-squeezenet-1.1-explicit-window-global-pooling
> kernel_size: 14
> stride: 14
(at least, for working correctly with NVIDIA's TensorRT 1.0.0).
In response to @shrutisharmavsco's great answer, make the following change in shapes.py:
```python
def get_strided_kernel_output_shape(node, round_func):
    assert node.layer is not None
    input_shape = node.get_only_parent().output_shape
    node.layer.set_input_shape(input_shape)
    o_h, o_w = get_filter_output_shape(input_shape.height, input_shape.width,
                                       node.layer.kernel_parameters, round_func)
    params = node.layer.parameters
    has_c_o = hasattr(params, 'num_output')
    c = params.num_output if has_c_o else input_shape.channels
    return TensorShape(input_shape.batch_size, c, o_h, o_w)
```
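As a quick sanity check of what this computes for the global-pool case, here is a minimal sketch of the usual Caffe output-size formula (the helper below is illustrative only, not the converter's actual get_filter_output_shape):

```python
import math

def filter_output_dim(i, k, s, p, round_func=math.floor):
    # Standard formula: output = round((input + 2*pad - kernel) / stride) + 1
    return int(round_func((i + 2 * p - k) / float(s)) + 1)

# With the kernel set to the full input size (e.g. 14, one of the window sizes
# quoted above for SqueezeNet 1.1), stride 1 and no padding, the pool produces
# a single 1x1 output per channel, as a global pool should.
print(filter_output_dim(14, k=14, s=1, p=0))  # 1
```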
Maybe there is something wrong in @shrutisharmavsco's code. I think `elif self._input_size:` should be `elif self._input_shape:`.
Thanks all.... this worked for me. I've collected the above comments into a PR at https://github.com/ethereon/caffe-tensorflow/pull/123