"Cannot infer the missing size in [-1, 0] when there are 0 elements" when using padding in custom converted TFJS model
Hi! I'm using a custom model for Background Matting that is tested and working on ONNX Runtime, ONNX Runtime Web and TensorFlow (0.11 s inference time on GPU). To use it, I convert the custom PyTorch model to ONNX, then to TensorFlow and finally to TensorFlow.js, using the recommended libraries onnx-tensorflow and tensorflowjs_converter.
The problem is that this model does not work at all with TensorFlow.js. I've found that the error "Cannot infer the missing size in [-1, 0] when there are 0 elements" is caused by PyTorch's padding function, and I just can't understand why.
The problematic line is something like this:
from torch.nn import functional as F
x = F.pad(x, (3, 3, 3, 3))
I've run several tests and I'm sure that the TFJS problem is caused solely by this padding operation. In fact, if I remove it the model runs, but with all the dimensions messed up, so it's a function I absolutely need in my model. I also have a simplified version with less accurate results that doesn't use the padding, and it works on TFJS.
Is this actually a TFJS bug? Is there a workaround to make padding work? Any help is much appreciated.
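To clarify what that call does: F.pad with a 4-tuple pads the last two dimensions of the tensor (left, right, top, bottom), so on an NCHW image tensor (3, 3, 3, 3) adds 3 pixels on every side. A minimal NumPy sketch of the equivalent operation, just to make the expected shapes explicit (the shapes here are illustrative):

```python
import numpy as np

# An NCHW-like image batch: 1 image, 3 channels, 4x5 pixels.
x = np.zeros((1, 3, 4, 5))

# PyTorch's F.pad(x, (3, 3, 3, 3)) pads the last two axes:
# (left, right, top, bottom). numpy.pad expresses the same thing
# as per-axis (before, after) pairs, leading axes untouched.
padded = np.pad(x, ((0, 0), (0, 0), (3, 3), (3, 3)))

print(padded.shape)  # (1, 3, 10, 11)
```

So the op itself is a plain constant pad of the spatial dimensions; nothing about it should be shape-ambiguous.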
Hi @FabioRomagnolo, it's very possible that something got lost in translation during the PyTorch => ONNX => TF => TFJS conversion. Can you help by creating a minimal model that includes only the pad function and sharing all the model artifacts from the conversion pipeline with us? Thanks
Thanks for your response. I've converted this simple PyTorch model (don't care about the output names):
from torch import nn
from torch.nn import functional as F
class PadTest(nn.Module):
    """
    A simple model executing a padding operation on two input images,
    to test the conversion to TFJS.
    """
    def __init__(self, padding=None):
        super().__init__()
        if padding is None:
            padding = [3, 3, 3, 3]
        self.padding = padding

    def forward(self, src, bgr):
        padded_src = F.pad(src, self.padding)
        padded_bgr = F.pad(bgr, self.padding)
        return padded_src, padded_bgr
After converting to a simplified ONNX model with onnx-simplifier, the conversion to TensorFlow crashes with the error "Invalid value in tensor used for shape: -3" inside a tf.slice() operation, which is already a bad sign.
Anyway, skipping the ONNX simplification and converting the raw ONNX model to TF and then to TFJS, everything seems to go well.
But even though the TF model works, the TFJS model throws this error during execution: "Error: GatherV2: the index value 0 is not in [0, -1]", and this is the error that leads to the "Cannot infer the missing size in [-1, 0] when there are 0 elements" error in my original model.
This is the download link for the models. The simplified version of the TF model is obviously missing because, as stated above, the conversion crashes before completing.
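For what it's worth, the "Cannot infer the missing size in [-1, 0] when there are 0 elements" message looks like a reshape with an inferred (-1) dimension receiving a tensor with zero elements, so the missing size can't be deduced; that would point to an upstream op (the GatherV2 above) producing an empty shape tensor. A NumPy analogue of the same failure, as a sketch of the mechanism:

```python
import numpy as np

# A tensor that has ended up empty: shape (0,), zero elements.
empty = np.zeros((0,))

# Reshaping with -1 asks the library to infer the missing size from
# the element count; with 0 elements and a 0-sized axis the inferred
# size is ambiguous, so the reshape fails -- analogous to what TFJS
# reports as "Cannot infer the missing size in [-1, 0]".
try:
    empty.reshape(-1, 0)
except ValueError as e:
    print("reshape failed:", e)
```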
@FabioRomagnolo I took a look at the generated models, it is quite complex, seems ONNX turns constants into sparse representation, is it possible to disable that during conversion?
That's not possible: during conversion you can only change the opset* and choose whether or not to do constant folding, which is unrelated to the problem described above.
*The opset compatible with my model is actually version 12, because the Squeeze v13 operation is not supported by the ONNX -> TensorFlow converter.
@FabioRomagnolo Any luck finding a solution to this issue?
Actually, no. I only succeeded in converting the native TensorFlow model to TensorFlow.js.
Hi, @FabioRomagnolo
Thank you for opening this issue. Since this issue has been open for a long time, the code/debug information for this issue may no longer be relevant to the current state of the code base.
The TFJS team is constantly improving the framework by fixing bugs and adding new features. We suggest you try the latest TFJS version with the latest compatible hardware configuration, which could potentially resolve the issue. If you are still facing the issue, please create a new GitHub issue with your latest findings and all the debugging information that could help us investigate.
Please follow the release notes to stay up to date with the latest developments happening in the TensorFlow.js space.
Thank you for your support and cooperation.
@gaikwadrahul8 Just for posterity, I solved this issue by using: https://github.com/PINTO0309/onnx2tf
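In case it helps others, the replacement pipeline looks roughly like this (an untested command sketch: the flags are from the onnx2tf README, and the model/directory names are placeholders):

```shell
# Install the converters (exact versions may matter).
pip install onnx2tf tensorflowjs

# ONNX -> TensorFlow SavedModel, bypassing onnx-tensorflow entirely.
# -i: input ONNX file, -o: output SavedModel directory.
onnx2tf -i model.onnx -o saved_model

# SavedModel -> TensorFlow.js graph model.
tensorflowjs_converter --input_format=tf_saved_model saved_model tfjs_model
```

onnx2tf rewrites the graph to NHWC instead of wrapping each op with transposes, which is likely why the problematic Pad/GatherV2 pattern disappears.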
This issue has been marked stale because it has had no recent activity for 7 days. It will be closed if no further activity occurs. Thank you.
This issue was closed due to lack of activity after being marked stale for the past 7 days.