
Instance segmentation on the u7 branch: how to infer with ONNX?

Open BandyKenny opened this issue 2 years ago • 2 comments

I used the script at \yolov7-u7\seg\export.py and the conversion to ONNX succeeded, but I cannot run inference on an image with the exported model.

BandyKenny avatar Sep 16 '22 03:09 BandyKenny

If you check the code at seg/models/common.py, you can see that the ONNX backend returns only the first output of the model:

        elif self.onnx:  # ONNX Runtime
            im = im.cpu().numpy()  # torch to numpy
            y = self.session.run([self.session.get_outputs()[0].name], {self.session.get_inputs()[0].name: im})[0]

jiugary avatar Sep 17 '22 22:09 jiugary

I changed the common file to match yolov5 and it works. In DetectMultiBackend __init__():

        elif onnx:  # ONNX Runtime
            LOGGER.info(f'Loading {w} for ONNX Runtime inference...')
            cuda = torch.cuda.is_available() and device.type != 'cpu'
            check_requirements(('onnx', 'onnxruntime-gpu' if cuda else 'onnxruntime'))
            import onnxruntime
            providers = ['CUDAExecutionProvider', 'CPUExecutionProvider'] if cuda else ['CPUExecutionProvider']
            session = onnxruntime.InferenceSession(w, providers=providers)
            output_names = [x.name for x in session.get_outputs()]
            meta = session.get_modelmeta().custom_metadata_map  # metadata
            if 'stride' in meta:
                stride, names = int(meta['stride']), eval(meta['names'])

and in forward()

        elif self.onnx:  # ONNX Runtime
            im = im.cpu().numpy()  # torch to numpy
            y = self.session.run(self.output_names, {self.session.get_inputs()[0].name: im})
            pred, *others, proto = [torch.tensor(i, device=self.device) for i in y] # return to torch
            y = (pred, (others, proto)) # change output shape like `pt` output

jiugary avatar Sep 17 '22 23:09 jiugary
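For reference, the repacking in the patched forward() can be sketched with dummy arrays. The shapes below (25200 = 3·(80² + 40² + 20²) anchors at 640×640 input; 117 = 4 box + 1 objectness + 80 classes + 32 mask coefficients) are assumptions for a typical yolov7-seg export, not guaranteed for every model; verify yours in netron.app:

```python
import numpy as np

# Hypothetical output shapes for a yolov7-seg ONNX export at 640x640 input
# (these are assumptions; inspect your own model in netron.app).
pred  = np.zeros((1, 25200, 117), dtype=np.float32)      # fused detections
p3    = np.zeros((1, 3, 80, 80, 117), dtype=np.float32)  # raw per-scale heads
p4    = np.zeros((1, 3, 40, 40, 117), dtype=np.float32)
p5    = np.zeros((1, 3, 20, 20, 117), dtype=np.float32)
proto = np.zeros((1, 32, 160, 160), dtype=np.float32)    # mask prototypes

# Order as returned by session.run(output_names, ...)
outputs = [pred, p3, p4, p5, proto]

# Repack like the PyTorch backend: (pred, (other_outputs, proto))
pred, *others, proto = outputs
y = (pred, (others, proto))

print(y[0].shape)     # (1, 25200, 117)
print(y[1][1].shape)  # (1, 32, 160, 160)
```

The downstream NMS code only needs `y[0]` (detections) and `y[1][1]` (prototypes); the raw per-scale heads ride along unused.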

I have a similar ONNX model: [image]

What do those outputs mean? Are they supposed to be removed during reparameterization (which is not working for segmentation)?

V4A001 avatar Dec 15 '22 12:12 V4A001

I am getting the same ONNX model. The thing is that the predict.py script with @jiugary's patch seems to correctly understand those output layers. Does anyone know how to parse them?

aurelm95 avatar Dec 16 '22 11:12 aurelm95
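A hedged sketch of how the prototype output is typically consumed, mirroring yolov5's process_mask logic: each detection that survives NMS carries 32 mask coefficients, and its instance mask is the sigmoid of their linear combination with the 32 prototype maps. The shapes and the "coefficients live in the last 32 columns" assumption should be checked against your own export:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical post-NMS data: n surviving detections, each with 32 mask
# coefficients (assumed to be the last 32 columns of a detection row),
# plus the prototype tensor from the model's last output.
n = 4
coeffs = np.random.randn(n, 32).astype(np.float32)
proto  = np.random.randn(1, 32, 160, 160).astype(np.float32)

# Linear combination of prototype maps, then sigmoid -> per-instance masks
c, mh, mw = proto.shape[1:]
masks = sigmoid(coeffs @ proto[0].reshape(c, -1)).reshape(n, mh, mw)

print(masks.shape)  # (4, 160, 160)
```

The 160×160 masks are then thresholded, cropped to each box, and upsampled to the original image size.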

I am trying to get this example to run. I have problems with the input, which seems to need to be dynamic:

    [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: images for the following indices index: 2 Got: 384 Expected: 640 Please fix either the inputs or the model.

So I created a dynamic input. Not sure if that makes it more complicated; I do not get to the output yet. [image]

V4A001 avatar Dec 18 '22 17:12 V4A001
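The 384-vs-640 mismatch above is likely the rectangular letterboxing in the dataloader: with auto=True it pads only to the nearest stride multiple, not to a full square, so a wide image produces a 384×640 tensor that a fixed 640×640 export rejects. A small sketch of the shape arithmetic (not the repo's actual letterbox function, just the same computation) illustrates this:

```python
import numpy as np

def letterbox_shape(h, w, new_shape=640, stride=32, auto=True):
    """Compute the padded shape that yolov5/yolov7-style letterboxing produces.

    auto=True (rectangular inference) pads only to the nearest stride
    multiple; auto=False pads all the way to new_shape x new_shape.
    """
    r = min(new_shape / h, new_shape / w)  # scale ratio, keep aspect
    uh, uw = round(h * r), round(w * r)    # unpadded, scaled size
    if auto:
        ph = int(np.ceil(uh / stride) * stride)  # pad up to stride multiple
        pw = int(np.ceil(uw / stride) * stride)
    else:
        ph = pw = new_shape                      # pad to full square
    return ph, pw

print(letterbox_shape(720, 1280, auto=True))   # (384, 640) -> rejected by fixed export
print(letterbox_shape(720, 1280, auto=False))  # (640, 640) -> matches fixed export
```

So the alternatives are: export with dynamic axes, or letterbox with auto=False so every image is padded to the exact exported input size.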

Not going in the right direction:

    [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Mul node. Name:'/model.105/Mul_26' Status Message: D:\a\_work\1\s\onnxruntime\core/providers/cpu/math/element_wise_ops.h:523 onnxruntime::BroadcastIterator::Append axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 12 by 20

V4A001 avatar Dec 18 '22 17:12 V4A001

@aurelm95 My experience is C# code with a high abstraction layer. From what I can understand of the Python code, the ONNX model is also loaded by predict.py. The DetectMultiBackend(nn.Module) class can create an instance of type 'ONNX'. Then the dataloader loads streams or images, or a combination. Then inference is called across dt[0], dt[1], dt[2]; the last holds the predictions. Those predictions are then enumerated and, per image, an annotator and masks are processed. I am not that good at Python. I tried with another example: https://github.com/ibaiGorordo/ONNX-TopFormer-Semantic-Segmentation

That one is written with much more abstraction, but still a lot of Python I do not understand. So basically both yolov7 and yolov5 keep the mask-output code in their predict scripts, and I believe the ONNX path must go the same way.

V4A001 avatar Dec 18 '22 22:12 V4A001

@V4A001 after some research, I realized that these weird output layers are the real output layers of the model; there is no bug. The thing is that after inference, an NMS algorithm is applied.

https://github.com/WongKinYiu/yolov7/blob/44f30af0daccb1a3baecc5d80eae22948516c579/seg/segment/predict.py#L121-L135

This NMS algorithm is designed to parse these output layers. When we export a 'detection' yolov7 model, the NMS algorithm is added as a new layer to the neural network, and hence we see a clean output layer in netron.app.

In our case, the u7 branch, we cannot export a model to ONNX with the NMS algorithm included. We have to apply it ourselves, as in the code referenced above. If we want to consume the output in a downstream TensorRT model, we have to re-write this NMS function :(

aurelm95 avatar Dec 27 '22 15:12 aurelm95
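For anyone re-writing NMS outside of PyTorch (e.g. for a TensorRT or C# pipeline), the greedy algorithm itself is small. This is a minimal NumPy sketch of class-agnostic NMS, not the repo's non_max_suppression, which additionally handles confidence filtering, per-class offsets, and the mask coefficients:

```python
import numpy as np

def iou(box, boxes):
    """IoU of one xyxy box against an array of xyxy boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter + 1e-9)

def nms(boxes, scores, iou_thres=0.45):
    """Greedy NMS: keep highest-scoring boxes, drop heavy overlaps.

    Returns the indices of the kept boxes, best score first.
    """
    order = scores.argsort()[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        # Keep only remaining boxes that do not overlap box i too much
        order = order[1:][iou(boxes[i], boxes[order[1:]]) <= iou_thres]
    return np.array(keep)

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7], dtype=np.float32)
print(nms(boxes, scores))  # the near-duplicate second box is suppressed
```

For the segmentation case, the kept indices are then used to select both the boxes and their 32 mask coefficients before the prototype multiplication.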

@aurelm95 yep, we concluded the same. What do dt[0] and dt[1] do? Are they some kind of pivots? I still have the open question of why the model has multiple outputs rather than just one.

V4A001 avatar Dec 30 '22 08:12 V4A001