tensorflow-onnx icon indicating copy to clipboard operation
tensorflow-onnx copied to clipboard

ValueError: tensorflow op ExtractImagePatches is not supported

Open ayermolo opened this issue 6 years ago • 30 comments

I am trying to convert yolov2-voc generated by DarkFlow in to ONNX, and running in to this error "ValueError: tensorflow op ExtractImagePatches is not supported"

I am using tensorflow=1.9.0, onnx=1.4.1, opset=7, tfonnx=1.5.0/9fcb06.

Is this tensorflow issue, or the converter? How would I got about fixing this?

Thank You

ayermolo avatar Apr 02 '19 18:04 ayermolo

currently, for TF ExtractImagePatches op, there is no equivalent op in onnx. would need to propose and add such op into future opset.

nbcsm avatar Apr 02 '19 18:04 nbcsm

@nbcsm Thank you for reply. What would be the process for that?

ayermolo avatar Apr 02 '19 18:04 ayermolo

go to https://github.com/onnx/onnx, and open an issue for ExtractImagePatches

nbcsm avatar Apr 02 '19 18:04 nbcsm

Great, thanks

ayermolo avatar Apr 02 '19 18:04 ayermolo

@nbcsm can it be composed in the converter?

prasanthpul avatar Apr 03 '19 17:04 prasanthpul

Along the same lines. Translated in to something else in special cases: I think ExtractImagePatch with ksizes = [1, blk, blk, 1], stride = [1, blk, blk, 1], rates = [1, 1, 1, 1] is equivalent to SpaceToDepth with blocksize = blk, blk.

ayermolo avatar Apr 03 '19 17:04 ayermolo

for that case (ExtractImagePatch with ksizes = [1, blk, blk, 1], stride = [1, blk, blk, 1], rates = [1, 1, 1, 1]), yes, SpaceToDepth seems to work. but it may not work for other ExtractImagePatch cases.

it feels like we may use MaxPool + SpaceToDepth to convert ExtractImagePatch, but need to double check.

nbcsm avatar Apr 03 '19 20:04 nbcsm

Hi, I'm having the same issue here when trying to convert YOLOv2 full network (416x416) into ONNX and I really need ExtractImagePatch op, I was trying to use tensorflow model instead but it doesnt worked, as ML.NET failed to load it. Please add support for this or at least some workaround code?

How do I define own special op for that?

xtremertx avatar Aug 05 '19 13:08 xtremertx

@nbcsm Hi can someone just paste here the ExtractImagePatch handler that is using SpaceToDepth ? I'm new in this area and writing own operations seems quite hard to me, I need some assistence, thanks.

xtremertx avatar Aug 05 '19 14:08 xtremertx

Are you using DarkFlow? It's pretty straightforward: https://github.com/ayermolo/darkflow/commit/c62d503493d2db6f3d94d106a60b2726a4eda47a

ayermolo avatar Aug 05 '19 18:08 ayermolo

Yes, I have used DarkFlow and converted darknet (.weights) model into tensorflow (.pb), because only tensorflow and ONNX are supported formats in ML.NET which I'm using. However tensorflow model throwed error while loading it using ML.NET (I have tested that model using darkflow and detections were working there, will report that issue to MS). So logical step was to try to convert my model into ONNX that is also supported.

Gonna to try your commit, thanks a lot.

xtremertx avatar Aug 05 '19 20:08 xtremertx

Hi, any update on this ? I am trying to convert a model using this operator in Tensor flow without success. Thanks for any assistance !

jhagege avatar Oct 26 '20 18:10 jhagege

#1543 features another model using this op. ksizes = [1, 3, 3, 1], rates = [1, 1, 1, 1], strides = [1, 1, 1, 1], padding = 'SAME'/'VALID'

TomWildenhain-Microsoft avatar Jun 18 '21 19:06 TomWildenhain-Microsoft

I have tried to make a numpy implementation of ExtractImagePatches:

def numpy_eip(arr, ksizes, strides, rates, padding):
	sizes = [1, ksizes[1]*rates[1] - (rates[1]-1), ksizes[2]*rates[2] - (rates[2]-1), 1]
	
	if padding == 'SAME':
		extra_i = max(0, (arr.shape[1]-1) // strides[1] * strides[1] + sizes[1] - arr.shape[1])
		extra_j = max(0, (arr.shape[2]-1) // strides[2] * strides[2] + sizes[2] - arr.shape[2])
		arr = np.pad(arr, [(0,0), (extra_i//2, extra_i//2 + extra_i%2), (extra_j//2, extra_j//2 + extra_j%2), (0,0)])
	elif padding != 'VALID':
		raise Exception('Padding type "%s" is not supported' % padding)
	
	def make_range(in_size, k_size, rate, stride):
		return range(0, in_size - (k_size*rate - rate), stride)
	indexes_i = make_range(arr.shape[1], ksizes[1], rates[1], strides[1])
	indexes_j = make_range(arr.shape[2], ksizes[2], rates[2], strides[2])

	batch_size = arr.shape[0]
	channel_size = ksizes[1]*ksizes[2]*arr.shape[3]

	return np.concatenate([np.concatenate([
		arr[:, i : sizes[1]+i : rates[1], j : sizes[2]+j : rates[2], :].reshape([batch_size, 1, 1, channel_size])
			for j in indexes_j], axis=2)
				for i in indexes_i], axis=1)

Note that the TF documentation says that ksizes, strides, and rates must be on the form [1, a, b, 1]. ExtractImagePatches can probably be seen as a generalized version of SpaceToDepth where the blocks/tiles may overlap or have gaps between them. If the dimensions of the input array is known, I believe it should be possible to map this to a series of slices, reshape/flatten, and concatenations. (The last 4 lines.) In my case we are talking about 32x32/30x30 of those which is a bit excessive. Any ideas on how to do this differently? I do not understand what the MaxPool idea is about.

spillerrec avatar Jun 19 '21 23:06 spillerrec

If we want to do an onnx op, could we do a conv with an identity kernel?

TomWildenhain-Microsoft avatar Jun 20 '21 02:06 TomWildenhain-Microsoft

I have done a quick test with my ksizes = [1, 3, 3, 1], rates = [1, 1, 1, 1], strides = [1, 1, 1, 1], padding = 'VALID' case and it appears to work: eip_valid.zip (I have yet to test it in the full model, I will take a few days off this so I will check that later this week.) The main issue is that this is 20MB uncompressed which is a bit of a shame. But other than that this could be a quite good mapping if the rest of the stride/rate fits the Conv behavior AND you have a low ksize and channel dimension. This gets more and more bloated as ksize increases.

spillerrec avatar Jun 21 '21 10:06 spillerrec

I think you should be able to move the M and C values of 2304 and 256 to the batch dimension, do the conv with a simple [1, 1, 3, 3] kernel and put the batch dims back. That should make the kernel much smaller.

TomWildenhain-Microsoft avatar Jun 21 '21 17:06 TomWildenhain-Microsoft

Minimal effort test without batch handling, seems to work, and down to 1 KiB with a [9, 1, 3, 3] kernel: eip_valid.zip This is using a reshape however which I think would be nice to avoid, as it requires you to specify a size. I like the concept of dynamic axes at least.

spillerrec avatar Jun 21 '21 21:06 spillerrec

Perfect. Actually if reshape sees a -1 in the shape it will dump all the remaining dimensions into there, and a 0 indicates to keep that dim the same. Since the input is [9, 256, 30, 30], try [1, -1, 0, 0] instead of [1, 2304, 30, 30]. It is also common to use Shape, Slice, Concat, and Reshape together to make the shapes dynamic. Should be cheap to execute since all those ops are small.

TomWildenhain-Microsoft avatar Jun 21 '21 22:06 TomWildenhain-Microsoft

That is really useful, I could fix the rest of the Reshapes in my model using that as well. I have not yet looked at how this Conv approach works for other cases, but I have converted my model and I'm quite happy with the result. It works perfectly and allows me to have dynamic input sizes as well (after fixing several other issues with Reshape, Resize, and ConvTranspose). For the parameters:

  • ksizes/sizes: Works, though it might not be a good fit with with large values, such as 32x32. I don't know if such usage is common.
  • stride: Should map perfectly to the stride in the Conv op I believe
  • rate: This adds a stride to the ksize x ksize inputs. I don't know if this is what dilations is for Conv, but otherwise I'm not sure what could be done for this. ("dilation value along each spatial axis of the filter" doesn't really explain anything.)
  • padding: VALID appears to match Conv's behavior with ksize and stride so nothing needs to be done here. SAME adds zero padding in a way that is dependent on the input size and can easily be done with just adding padding of the appropriate size. If we are lucky the auto_pad attribute of Conv behaves the same, but otherwise it just means calculating the appropriate padding for that specific input size.

Assuming we have a way to convert ExtractImagePatches to either Conv or SpaceToDepth in either all cases or in specific cases, what can/needs to be done to move this forward and what can I help with? Could this be done by the converter (the Conv approach technically is a change of semantics even though the behavior is the same), or could we make a guide on what options you have to deal with these models instead (i.e. custom OP implementation, replacing it with common OPs, etc.). Is there still a need for pushing ExtractImagePatches to get standalized assuming we have a workaround with standard operators?

spillerrec avatar Jun 27 '21 22:06 spillerrec

@spillerrec it would be fantastic if you could add support for this op in tf2onnx. Add a handler for ExtractImagePatches in nn.py https://github.com/onnx/tensorflow-onnx/blob/master/tf2onnx/onnx_opset/nn.py You can copy CTCGreedyDecoder as a template. Use the lowest onnx opset you can get away with https://github.com/onnx/onnx/blob/master/docs/Operators.md. Then add some unit tests to test_backend.py https://github.com/onnx/tensorflow-onnx/blob/master/tests/test_backend.py

It is perfectly fine to only be able to convert certain cases. Use utils.make_sure(<condition>, "explanation") to require a certain condition for conversion.

It's fine that this conversion uses a composition of ops and might have different performance characteristics from the original TF op. That's pretty common in the converter. Since this is a pre-processing op, hopefully the inference time won't be dominated by it. For large sizes hopefully we can use SpaceToDepth instead, since I think those cases have strides matching the size (otherwise the output would be huge anyway).

As you are working, open a PR and I'll be happy to comment if you run into any issues.

TomWildenhain-Microsoft avatar Jun 28 '21 18:06 TomWildenhain-Microsoft

@TomWildenhain-Microsoft Thanks for the pointers, I will give it a go.

spillerrec avatar Jun 30 '21 20:06 spillerrec

@spillerrec Just checking if you've had a chance to work on this. Happy to help if you are stuck.

TomWildenhain-Microsoft avatar Jul 15 '21 20:07 TomWildenhain-Microsoft

@TomWildenhain-Microsoft I'm sorry about the lack of updates, no, I have not looked at it yet. However I have next week off, so I will not have an excuse now for not getting some progress done on it. ; )

spillerrec avatar Jul 16 '21 23:07 spillerrec

@spillerrec Are there any updates on this issue?

savvaki avatar Dec 09 '21 06:12 savvaki

@savvaki No, not anything else than I looked into the dilations parameter and that it should be possible to use that to implement rate.

To sum up, the current status is that we have a decent workaround using standard operators and most people would probably be happy with it. It is currently stalled on me to actually add it.

I honestly forgot about this issue, so thanks for the ping. I found out that mingw can't be used with onnxruntime on Windows (and I have no intention to use Visual Studio), so this stalled my own ONNX related project (and will probably stay so until I switch back to Linux). I'm still willing to look at it, just don't assume it will be anytime soon with my track record and certainly not before January. If anyone else is interested in getting this done, don't hold back because of me.

spillerrec avatar Dec 11 '21 14:12 spillerrec

@spillerrec Do you perhaps have the tf code lying around that you used to generate the onnx models that you posted above?

savvaki avatar Dec 13 '21 07:12 savvaki

I created an "equivalent" operation to tf.image.extract_patches using an identity kernel with a convolution. The code seemingly works for batches, as well as different patch sizes. In addition, multiple input feature maps seem to work. This is an code snippet if anyone is interested. I have not tried different padding variations/strides, meaning that this snippet is not complete for more exotic use cases. Feel free to modify.

import tensorflow as tf
import numpy as np

# Define some parameters
input_shape = [5, 10, 10, 6]
patch_size = 3 # height/width of the square image patch size
padding = 'SAME'

# Generate example input
x_original = tf.cast(tf.reshape(tf.range(np.prod(input_shape)), input_shape), dtype=tf.float32)

x = tf.transpose(x_original, perm=[0, 3, 1, 2]) # Move feature map dimension next to the batch dimension
x = tf.expand_dims(x, -1) # Add extra channel at the end 

# Create an identity kernel
kernel = tf.reshape(tf.eye(patch_size**2), [patch_size, patch_size, 1, patch_size**2]) # [filter_height, filter_width, in_channels, out_channels]

# Convolve with identity kernel
patches_simulation = tf.nn.conv2d(x, kernel, strides=[1, 1, 1, 1], padding='VALID')
patches_simulation = tf.transpose(patches_simulation, perm=[0 ,2 ,3, 4, 1]) # Move filter dim to last
patches_simulation_shape = tf.shape(patches_simulation)
patches_simulation = tf.reshape(patches_simulation, [patches_simulation_shape[0], patches_simulation_shape[1], patches_simulation_shape[2], -1]) # Merge last two dims into one

# Intended output to compare against
patches = tf.image.extract_patches(x_original, sizes=[1, patch_size, patch_size, 1], strides=[1, 1, 1, 1], rates=[1, 1, 1, 1], padding='VALID')

# Print out check to see if the alternative method is identical to tf.image.extract_patches
print(f'tf.image.extract_patches shape {patches.shape} simulation shape {patches_simulation.shape} same shape: {patches.shape == patches_simulation.shape}')
print(f'Simulation is correct? {tf.reduce_all(tf.math.equal(patches, patches_simulation)).numpy()}')

savvaki avatar Dec 14 '21 08:12 savvaki

I am trying to convert a custom developed model with a custom convolution layer and I implemented it with tf.image.extract_patches and running in to this error while Coverting to tflite model. "RuntimeError: Failed to initialize op resolver for calibration: There are unresolved custom ops: [ExtractImagePatches]Encountered unresolved custom op: ExtractImagePatches." I am using tensorflow=2.8 I know that tflite doesn't support tf.image.extract_patches, is there any way to get around it? or custom function for tf.image.extract_patches?

ManishwarG avatar May 15 '22 23:05 ManishwarG

Hi! Any updates? I'm also waiting for this update :(

[UPDATE] I found this implementation that works for me: https://github.com/PINTO0309/openvino2tensorflow/blob/63fff406a2bd44258c91521add6efc8acf899808/openvino2tensorflow/openvino2tensorflow.py#L6821

kobiche avatar Mar 21 '23 16:03 kobiche