convert yolact to ONNX

Open sdimantsd opened this issue 6 years ago • 65 comments

Hello again, I'm trying to convert YOLACT to ONNX with the following code:

weights_path = '/home/ws/DL/yolact/weights/yolact_im700_54_800000.pth'

import torch
import torch.onnx
import yolact
import torchvision

model = yolact.Yolact()

# state_dict = torch.load(weights_path)
# model.load_state_dict(state_dict)

model.load_weights(weights_path)

dummy_input = torch.randn(1, 3, 640, 480)

torch.onnx.export(model, dummy_input, "onnx_model_name.onnx")

error msg:

/home/ws/DL/yolact/yolact.py:256: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  for j, i in product(range(conv_h), range(conv_w)):
/home/ws/DL/yolact/yolact.py:279: TracerWarning: torch.Tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  self.priors = torch.Tensor(prior_data).view(-1, 4)
/home/ws/DL/yolact/yolact.py:279: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  self.priors = torch.Tensor(prior_data).view(-1, 4)
/home/ws/DL/yolact/layers/functions/detection.py:74: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  for batch_idx in range(batch_size):
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-a796dc0eef97> in <module>
     13 dummy_input = torch.randn(1, 3, 700, 700)
     14 
---> 15 torch.onnx.export(model, dummy_input, "onnx_model_name.onnx")

~/.local/lib/python3.6/site-packages/torch/onnx/__init__.py in export(*args, **kwargs)
     23 def export(*args, **kwargs):
     24     from torch.onnx import utils
---> 25     return utils.export(*args, **kwargs)
     26 
     27 

~/.local/lib/python3.6/site-packages/torch/onnx/utils.py in export(model, args, f, export_params, verbose, training, input_names, output_names, aten, export_raw_ir, operator_export_type, opset_version, _retain_param_name, do_constant_folding, strip_doc_string)
    129             operator_export_type=operator_export_type, opset_version=opset_version,
    130             _retain_param_name=_retain_param_name, do_constant_folding=do_constant_folding,
--> 131             strip_doc_string=strip_doc_string)
    132 
    133 

~/.local/lib/python3.6/site-packages/torch/onnx/utils.py in _export(model, args, f, export_params, verbose, training, input_names, output_names, operator_export_type, export_type, example_outputs, propagate, opset_version, _retain_param_name, do_constant_folding, strip_doc_string)
    361                                                         output_names, operator_export_type,
    362                                                         example_outputs, propagate,
--> 363                                                         _retain_param_name, do_constant_folding)
    364 
    365         # TODO: Don't allocate a in-memory string for the protobuf

~/.local/lib/python3.6/site-packages/torch/onnx/utils.py in _model_to_graph(model, args, verbose, training, input_names, output_names, operator_export_type, example_outputs, propagate, _retain_param_name, do_constant_folding, _disable_torch_constant_prop)
    264             model.graph, tuple(args), example_outputs, False, propagate)
    265     else:
--> 266         graph, torch_out = _trace_and_get_graph_from_model(model, args, training)
    267         state_dict = _unique_state_dict(model)
    268         params = list(state_dict.values())

~/.local/lib/python3.6/site-packages/torch/onnx/utils.py in _trace_and_get_graph_from_model(model, args, training)
    223     # training mode was.)
    224     with set_training(model, training):
--> 225         trace, torch_out = torch.jit.get_trace_graph(model, args, _force_outplace=True)
    226 
    227     if orig_state_dict_keys != _unique_state_dict(model).keys():

~/.local/lib/python3.6/site-packages/torch/jit/__init__.py in get_trace_graph(f, args, kwargs, _force_outplace, return_inputs)
    229     if not isinstance(args, tuple):
    230         args = (args,)
--> 231     return LegacyTracedModule(f, _force_outplace, return_inputs)(*args, **kwargs)
    232 
    233 

~/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    491             result = self._slow_forward(*input, **kwargs)
    492         else:
--> 493             result = self.forward(*input, **kwargs)
    494         for hook in self._forward_hooks.values():
    495             hook_result = hook(self, input, result)

~/.local/lib/python3.6/site-packages/torch/jit/__init__.py in forward(self, *args)
    292         try:
    293             trace_inputs = _unflatten(all_trace_inputs[:len(in_vars)], in_desc)
--> 294             out = self.inner(*trace_inputs)
    295             out_vars, _ = _flatten(out)
    296             torch._C._tracer_exit(tuple(out_vars))

~/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    489             hook(self, input)
    490         if torch._C._get_tracing_state():
--> 491             result = self._slow_forward(*input, **kwargs)
    492         else:
    493             result = self.forward(*input, **kwargs)

~/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
    479         tracing_state._traced_module_stack.append(self)
    480         try:
--> 481             result = self.forward(*input, **kwargs)
    482         finally:
    483             tracing_state.pop_scope()

~/DL/yolact/yolact.py in forward(self, x)
    615                 pred_outs['conf'] = F.softmax(pred_outs['conf'], -1)
    616 
--> 617             return self.detect(pred_outs)
    618 
    619 

~/DL/yolact/layers/functions/detection.py in __call__(self, predictions)
     73 
     74             for batch_idx in range(batch_size):
---> 75                 decoded_boxes = decode(loc_data[batch_idx], prior_data)
     76                 result = self.detect(batch_idx, conf_preds, decoded_boxes, mask_data, inst_data)
     77 

RuntimeError: isTensor() ASSERT FAILED at /pytorch/aten/src/ATen/core/ivalue.h:209, please report a bug to PyTorch. (toTensor at /pytorch/aten/src/ATen/core/ivalue.h:209)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f721e0ac441 in /home/ws/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f721e0abd7a in /home/ws/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #2: <unknown function> + 0x979ad2 (0x7f721d130ad2 in /home/ws/.local/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #3: torch::jit::tracer::getNestedValueTrace(c10::IValue const&) + 0x41 (0x7f721d3939a1 in /home/ws/.local/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #4: <unknown function> + 0xa7651b (0x7f721d22d51b in /home/ws/.local/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #5: <unknown function> + 0xa766db (0x7f721d22d6db in /home/ws/.local/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #6: <unknown function> + 0x457942 (0x7f725d6d2942 in /home/ws/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #7: <unknown function> + 0x130cfc (0x7f725d3abcfc in /home/ws/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #8: _PyCFunction_FastCallDict + 0x35c (0x56204c in /usr/bin/python3)
frame #9: /usr/bin/python3() [0x5a1501]
frame #10: PyObject_Call + 0x3e (0x57c2fe in /usr/bin/python3)
frame #11: /usr/bin/python3() [0x5136c6]
frame #12: _PyObject_FastCallKeywords + 0x19c (0x57ec0c in /usr/bin/python3)
frame #13: /usr/bin/python3() [0x4f88ba]
frame #14: _PyEval_EvalFrameDefault + 0x467 (0x4f98c7 in /usr/bin/python3)
frame #15: _PyFunction_FastCallDict + 0xf5 (0x4f4065 in /usr/bin/python3)
frame #16: /usr/bin/python3() [0x5a1481]
frame #17: PyObject_Call + 0x3e (0x57c2fe in /usr/bin/python3)
frame #18: /usr/bin/python3() [0x513601]
frame #19: _PyObject_FastCallKeywords + 0x19c (0x57ec0c in /usr/bin/python3)
frame #20: /usr/bin/python3() [0x4f88ba]
frame #21: _PyEval_EvalFrameDefault + 0x467 (0x4f98c7 in /usr/bin/python3)
frame #22: /usr/bin/python3() [0x4f6128]
frame #23: _PyFunction_FastCallDict + 0x2fe (0x4f426e in /usr/bin/python3)
frame #24: /usr/bin/python3() [0x5a1481]
frame #25: PyObject_Call + 0x3e (0x57c2fe in /usr/bin/python3)
frame #26: _PyEval_EvalFrameDefault + 0x1851 (0x4facb1 in /usr/bin/python3)
frame #27: /usr/bin/python3() [0x4f6128]
frame #28: _PyFunction_FastCallDict + 0x2fe (0x4f426e in /usr/bin/python3)
frame #29: /usr/bin/python3() [0x5a1481]
frame #30: PyObject_Call + 0x3e (0x57c2fe in /usr/bin/python3)
frame #31: _PyEval_EvalFrameDefault + 0x1851 (0x4facb1 in /usr/bin/python3)
frame #32: /usr/bin/python3() [0x4f6128]
frame #33: _PyFunction_FastCallDict + 0x2fe (0x4f426e in /usr/bin/python3)
frame #34: /usr/bin/python3() [0x5a1481]
frame #35: PyObject_Call + 0x3e (0x57c2fe in /usr/bin/python3)
frame #36: /usr/bin/python3() [0x513601]
frame #37: PyObject_Call + 0x3e (0x57c2fe in /usr/bin/python3)
frame #38: _PyEval_EvalFrameDefault + 0x1851 (0x4facb1 in /usr/bin/python3)
frame #39: /usr/bin/python3() [0x4f6128]
frame #40: _PyFunction_FastCallDict + 0x2fe (0x4f426e in /usr/bin/python3)
frame #41: /usr/bin/python3() [0x5a1481]
frame #42: PyObject_Call + 0x3e (0x57c2fe in /usr/bin/python3)
frame #43: _PyEval_EvalFrameDefault + 0x1851 (0x4facb1 in /usr/bin/python3)
frame #44: /usr/bin/python3() [0x4f6128]
frame #45: _PyFunction_FastCallDict + 0x2fe (0x4f426e in /usr/bin/python3)
frame #46: /usr/bin/python3() [0x5a1481]
frame #47: PyObject_Call + 0x3e (0x57c2fe in /usr/bin/python3)
frame #48: /usr/bin/python3() [0x513601]
frame #49: PyObject_Call + 0x3e (0x57c2fe in /usr/bin/python3)
frame #50: _PyEval_EvalFrameDefault + 0x1851 (0x4facb1 in /usr/bin/python3)
frame #51: /usr/bin/python3() [0x4f6128]
frame #52: /usr/bin/python3() [0x4f7d60]
frame #53: /usr/bin/python3() [0x4f876d]
frame #54: _PyEval_EvalFrameDefault + 0x1260 (0x4fa6c0 in /usr/bin/python3)
frame #55: /usr/bin/python3() [0x4f7a28]
frame #56: /usr/bin/python3() [0x4f876d]
frame #57: _PyEval_EvalFrameDefault + 0x467 (0x4f98c7 in /usr/bin/python3)
frame #58: /usr/bin/python3() [0x4f6128]
frame #59: /usr/bin/python3() [0x4f7d60]
frame #60: /usr/bin/python3() [0x4f876d]
frame #61: _PyEval_EvalFrameDefault + 0x467 (0x4f98c7 in /usr/bin/python3)
frame #62: /usr/bin/python3() [0x4f6128]
frame #63: /usr/bin/python3() [0x4f7d60]

sdimantsd avatar Jun 23 '19 12:06 sdimantsd

See #59. You'll have to put some elbow grease in if you want to get YOLACT traceable (i.e., exportable to ONNX) since I use a lot of pythonic code. I hear @Wilber529 was able to do it following these steps: https://github.com/dbolya/yolact/issues/59#issuecomment-501609792. You have to rewrite how I pass around variables (dictionaries are not supported I think) and you'll have to rewrite anything after Yolact's forward function (starting with self.detect) in your target language because I wrote it in a super pythonic way to make the model faster.

dbolya avatar Jun 23 '19 13:06 dbolya
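
One way to read the advice about dictionaries is to wrap the network so that the traced forward pass returns a fixed-order tuple of tensors instead of a dict. The following is only a minimal sketch of that idea: `TinyDictModel` is a hypothetical stand-in, not YOLACT, and `TupleOutputWrapper` and the key names are illustrative. It does not address the post-processing after forward, which would still have to be rewritten separately.

```python
import torch
import torch.nn as nn


class TinyDictModel(nn.Module):
    """Hypothetical stand-in for a network whose forward returns a dict."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)

    def forward(self, x):
        feat = self.conv(x)
        return {"loc": feat.mean(dim=1), "conf": feat.softmax(dim=1)}


class TupleOutputWrapper(nn.Module):
    """Re-expose selected dict entries as a fixed-order tuple of tensors."""

    def __init__(self, model, keys):
        super().__init__()
        self.model = model
        self.keys = tuple(keys)

    def forward(self, x):
        out = self.model(x)
        return tuple(out[key] for key in self.keys)


wrapped = TupleOutputWrapper(TinyDictModel(), keys=("loc", "conf")).eval()
dummy_input = torch.randn(1, 3, 32, 32)

# Tracing sees only tensors at the module boundary, so the inner dict is fine.
traced = torch.jit.trace(wrapped, dummy_input)
loc, conf = traced(dummy_input)

# The export call itself would then look like this (left commented out here):
# torch.onnx.export(wrapped, dummy_input, "tiny.onnx",
#                   input_names=["image"], output_names=["loc", "conf"])
```

The order of `keys` fixes the order of the exported outputs, so whatever consumes the ONNX file has to read them in that same order.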

Hi @dbolya, thanks for your answer! I'm not sure I understood you. Can you please elaborate?

sdimantsd avatar Jun 23 '19 15:06 sdimantsd

Yolact does not support conversion to ONNX, which is why you get an error. You'd need to change a lot of things to get conversion to ONNX to work, as outlined by @Wilber529 in https://github.com/dbolya/yolact/issues/59#issuecomment-501609792. I'm not making these changes to the main branch because they'd make the Python version run slower and make it harder to develop.

dbolya avatar Jun 23 '19 15:06 dbolya

thx

sdimantsd avatar Jul 02 '19 14:07 sdimantsd

I have converted YOLACT to ONNX without the Detect part, and also modified some of the upsampling code: https://github.com/Ma-Dan/yolact/tree/onnx The ONNX model outputs loc, conf, mask and proto; the detect step has to be implemented separately, outside the model. I also converted the ONNX model to a CoreML model; 4 custom layers need to be implemented to make it work.

Ma-Dan avatar Jul 12 '19 01:07 Ma-Dan
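
As an illustration of what re-implementing part of the detect step outside the model can involve, here is a minimal NumPy sketch of decoding the `loc` output into boxes. It assumes the SSD-style encoding with center-form priors `(cx, cy, w, h)` and variances `(0.1, 0.2)`; the function name and sample values are illustrative, and the actual output layout should be checked against the branch linked above.

```python
import numpy as np


def decode_boxes(loc, priors, variances=(0.1, 0.2)):
    """Turn regression offsets into corner-form boxes (x1, y1, x2, y2).

    loc:    (num_priors, 4) offsets predicted by the network
    priors: (num_priors, 4) anchors as (cx, cy, w, h) in relative coordinates
    The variances are an assumption (common SSD-style defaults).
    """
    loc = np.asarray(loc, dtype=np.float32)
    priors = np.asarray(priors, dtype=np.float32)
    centers = priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:]
    sizes = priors[:, 2:] * np.exp(loc[:, 2:] * variances[1])
    return np.concatenate([centers - sizes / 2, centers + sizes / 2], axis=1)


# Zero offsets should reproduce the prior itself, converted to corner form.
priors = np.array([[0.5, 0.5, 0.2, 0.4]])
zero_offsets = np.zeros((1, 4))
boxes = decode_boxes(zero_offsets, priors)
```

Score thresholding and non-maximum suppression would follow this step; they are not shown here.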

@Ma-Dan thanks for sharing the reference code. I'll look into this process and get back to you if I have questions.

abhigoku10 avatar Jul 12 '19 09:07 abhigoku10

@Ma-Dan thank you very much for sharing your work. I am wondering, what needs to be implemented to execute the onnx model again. What does this mean? "Onnx model can get output of loc, conf, mask and proto, and detect process should be implemented with other methods." Thanks for your help!

aweissen1 avatar Jul 18 '19 10:07 aweissen1
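
Another piece of the detect step is combining the `proto` and `mask` outputs into instance masks. A minimal NumPy sketch, assuming the YOLACT formulation where each mask is a linear combination of prototypes followed by a sigmoid, a `(H, W, k)` prototype layout, and `(n, k)` coefficients for the detections that survive NMS. The function name, layout and sample data are assumptions for illustration only.

```python
import numpy as np


def assemble_masks(proto, coeffs):
    """Combine prototypes with per-detection coefficients.

    proto:  (H, W, k) prototype masks
    coeffs: (n, k) mask coefficients, one row per kept detection
    Returns (H, W, n) soft masks with values in (0, 1).
    """
    logits = proto @ coeffs.T  # (H, W, k) @ (k, n) -> (H, W, n)
    return 1.0 / (1.0 + np.exp(-logits))


# Synthetic inputs: all-zero prototypes give a sigmoid output of exactly 0.5.
proto = np.zeros((4, 4, 2), dtype=np.float32)
coeffs = np.ones((3, 2), dtype=np.float32)
masks = assemble_masks(proto, coeffs)
binary = masks > 0.5  # hard masks after thresholding
```

Cropping each mask to its box and resizing to the image size would come afterwards.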

@Ma-Dan Thank you for your code! I converted the model to ONNX, but the results differ from the PyTorch outputs: loc, mask and proto are different, while conf is the same. Do you see what the problem could be?

ABlueLight avatar Jul 18 '19 13:07 ABlueLight
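
When only some outputs disagree, a per-output comparison can help narrow things down, for example by showing whether outputs are being read in the wrong order. The sketch below is a generic NumPy helper, not code from either repository; the output names and the arrays are synthetic stand-ins, and the tolerances are arbitrary starting points.

```python
import numpy as np


def compare_outputs(reference, candidate, rtol=1e-3, atol=1e-5):
    """Return {name: (max_abs_diff, is_close)} for outputs present in both dicts."""
    report = {}
    for name in sorted(reference.keys() & candidate.keys()):
        ref = np.asarray(reference[name], dtype=np.float64)
        cand = np.asarray(candidate[name], dtype=np.float64)
        if ref.shape != cand.shape:
            # A shape mismatch often means two outputs were swapped.
            report[name] = (float("inf"), False)
            continue
        max_diff = float(np.max(np.abs(ref - cand))) if ref.size else 0.0
        close = bool(np.allclose(ref, cand, rtol=rtol, atol=atol))
        report[name] = (max_diff, close)
    return report


# Synthetic stand-ins for the two sets of outputs (not real model results).
torch_out = {"conf": np.array([0.1, 0.9]), "loc": np.array([1.0, 2.0])}
onnx_out = {"conf": np.array([0.1, 0.9]), "loc": np.array([2.0, 1.0])}
report = compare_outputs(torch_out, onnx_out)
```

In practice the two dicts would be filled from the PyTorch forward pass and from the ONNX runtime session, using the same input image for both.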

@abhigoku10 actually I just used the onnx branch from Ma-Dan to create an onnx file. Do you get an error while converting?

aweissen1 avatar Jul 18 '19 14:07 aweissen1

@aweissen1 I was facing some package issues; I'll look into them in more depth and solve them. Was there any difference in the output generated?

abhigoku10 avatar Jul 18 '19 15:07 abhigoku10

@Ma-Dan Hi, I converted to ONNX successfully, but I found the results are not correct. Can you share which versions of PyTorch and onnxruntime you are using? Thanks.

ABlueLight avatar Jul 19 '19 12:07 ABlueLight

@Ma-Dan Can you give more information about the package dependencies for your Yolact-ONNX implementation? Also, have you compared the results of Yolact with those of your Yolact-ONNX implementation? If so, please share what you found.

sicarioakki avatar Jul 22 '19 06:07 sicarioakki

@ABlueLight and @aweissen1 should we use the base code given by @Ma-Dan and train the model, or just load the trained model with this code? What command should be used? Please share the process. Can I run it on GPU, and how many FPS are you getting?

abhigoku10 avatar Jul 22 '19 07:07 abhigoku10

I converted to ONNX successfully today, and the results are correct. Thanks @Ma-Dan @sicarioakki @abhigoku10. My package dependencies are pytorch 1.0.0, torchvision 0.2.1, onnx-tf 1.3.0, onnxruntime 0.4.0, onnx 1.5.0 and tensorflow-gpu 1.14.0. @Ma-Dan's code works as is; I didn't modify it, I just swapped in my trained model. The package versions may be an important factor.

ABlueLight avatar Jul 22 '19 10:07 ABlueLight

@ABlueLight after conversion to ONNX, which platform are you going to deploy it on? And did you convert it to a TensorFlow-based model?

abhigoku10 avatar Jul 22 '19 12:07 abhigoku10

@abhigoku10 TensorFlow, and it runs correctly.

ABlueLight avatar Jul 22 '19 13:07 ABlueLight

Sorry for the delayed reply. I just fixed the code in my repo to use the correct ONNX outputs: https://github.com/Ma-Dan/yolact/commit/a0648974369762445bc2095c2318f3f5c7fb7297 The previous version moved the constant prior output to a separate file to make the CoreML file correct, and I forgot to fix the ONNX output index. Sorry again!

Also note that to make the conversion to ONNX work, I hard-coded sizes here: https://github.com/Ma-Dan/yolact/blob/onnx/yolact.py#L344 So this code will not work correctly with the yolact_im700_54_800000.pth weights; you need to fix the sizes there.

Ma-Dan avatar Jul 22 '19 13:07 Ma-Dan
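
Exactly which sizes are hard-coded at the linked line is not shown in this thread. If they are the spatial sizes of the feature maps, the sketch below estimates them for a different input resolution. It rests on two labeled assumptions: every stride-2 stage maps a size `n` to `ceil(n / 2)`, and the first feature map used has stride 8 with five levels in total. The function name and defaults are illustrative.

```python
import math


def feature_map_sizes(input_size, num_levels=5, first_stride_log2=3):
    """Estimate square feature-map sizes for a given square input size.

    Assumes each stride-2 stage maps n -> ceil(n / 2) and that the first
    level used sits at stride 2 ** first_stride_log2 (8 by default).
    """
    size = input_size
    for _ in range(first_stride_log2):
        size = math.ceil(size / 2)
    sizes = [size]
    for _ in range(num_levels - 1):
        size = math.ceil(size / 2)
        sizes.append(size)
    return sizes


sizes_550 = feature_map_sizes(550)  # default YOLACT input size
sizes_700 = feature_map_sizes(700)  # input size of the im700 config
```

Any values obtained this way should be checked against the shapes the PyTorch model actually produces before they are written into the export code.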

The environment I used: onnx 1.4.1, onnxruntime 0.4.0, torch 1.0.1, torchvision 0.2.1

Run

python eval.py --trained_model=weights/yolact_darknet53_54_800000.pth --score_threshold=0.3 --top_k=100 --cuda=False --image=dog.jpg

to generate the onnx file. And run

python onnxeval.py --trained_model=weights/yolact_resnet50_54_800000.pth --score_threshold=0.3 --top_k=100 --cuda=False --image=dog.jpg

to evaluate with onnx.

Ma-Dan avatar Jul 22 '19 13:07 Ma-Dan

@Ma-Dan thanks for the response, I have a few questions:

  1. Can the ONNX model you obtained be used with C++?
  2. Can I convert that model to another framework like TF or Caffe?
  3. In the command you shared, you passed --cuda=False. Does that mean it can only run on CPU and not on GPU? I want to run it on GPU.

abhigoku10 avatar Jul 22 '19 15:07 abhigoku10

@Ma-Dan Thank you! Great job.

aweissen1 avatar Jul 22 '19 15:07 aweissen1

@ABlueLight how did you import it to Tensorflow?

ridasalam avatar Jul 22 '19 20:07 ridasalam

file_name= yolact_base_0_4000.onnx
params= ['yolact', 'base', '0', '4000']
model_name= yolact_base
epoch= 0
iteration= 4000
Config not specified. Parsed yolact_base_config from the file name.

Loading model...Traceback (most recent call last):
  File "onnxeval.py", line 1035, in
    net.load_weights(args.trained_model)
  File "/home/aeye/yolact-onnx/yolact_onnx_1/yolact.py", line 469, in load_weights
    state_dict = torch.load(path, map_location='cpu')
  File "/home/aeye/yolact-onnx/Yolact_ONNX/lib/python3.6/site-packages/torch/serialization.py", line 368, in load
    return _load(f, map_location, pickle_module)
  File "/home/aeye/yolact-onnx/Yolact_ONNX/lib/python3.6/site-packages/torch/serialization.py", line 532, in _load
    magic_number = pickle_module.load(f)
_pickle.UnpicklingError: invalid load key, '\x08'.

I was able to convert the model to .onnx format, but I get the above error during inference.

sicarioakki avatar Jul 23 '19 07:07 sicarioakki

> @ABlueLight how did you import it to Tensorflow?

https://github.com/onnx/onnx-tensorflow

ABlueLight avatar Jul 23 '19 07:07 ABlueLight

@aweissen1 @ABlueLight hi guys, I am facing the same issue as above during inference after conversion:

file_name= yolact_base_0_4000.onnx
params= ['yolact', 'base', '0', '4000']
model_name= yolact_base
epoch= 0
iteration= 4000
Config not specified. Parsed yolact_base_config from the file name.

Loading model...Traceback (most recent call last):
  File "onnxeval.py", line 1035, in
    net.load_weights(args.trained_model)
  File "/home/aeye/yolact-onnx/yolact_onnx_1/yolact.py", line 469, in load_weights
    state_dict = torch.load(path, map_location='cpu')
  File "/home/aeye/yolact-onnx/Yolact_ONNX/lib/python3.6/site-packages/torch/serialization.py", line 368, in load
    return _load(f, map_location, pickle_module)
  File "/home/aeye/yolact-onnx/Yolact_ONNX/lib/python3.6/site-packages/torch/serialization.py", line 532, in _load
    magic_number = pickle_module.load(f)
_pickle.UnpicklingError: invalid load key, '\x08'.

Any suggestions?

abhigoku10 avatar Jul 23 '19 08:07 abhigoku10

@Ma-Dan @aweissen1 @ABlueLight How are you guys able to load the ONNX model using the torch.load() function? Only onnx.load() can be used, right?

sicarioakki avatar Jul 23 '19 09:07 sicarioakki

@ABlueLight, do you have a huge difference in speed of inference?

I used @Ma-Dan's helpful work to generate yolact.onnx, and I load it with onnx.load and prepare from onnx_tf.backend. All other post-processing is still torch-based. Inference takes 2 minutes per image (compared to a couple of seconds in PyTorch).

Also, were you able to convert it to pure TensorFlow (using a TensorFlow pb file instead of ONNX)?

ridasalam avatar Jul 23 '19 15:07 ridasalam
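
When comparing inference speed between backends, it can help to time the model call and the post-processing separately, and to exclude the first few calls, which often include one-time setup costs. A small stdlib-only sketch; `time_inference` and its defaults are illustrative, and the lambda stands in for whatever actually runs the model.

```python
import time


def time_inference(run_once, warmup=2, repeats=5):
    """Average wall-clock seconds per call of run_once(), after warm-up calls."""
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(repeats):
        run_once()
    return (time.perf_counter() - start) / repeats


# Stand-in workload; replace the lambda with the real model call.
calls = []
avg_seconds = time_inference(lambda: calls.append(sum(range(1000))))
```

Running the same helper once around the backend call and once around the post-processing shows which of the two dominates the total time.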

@ridasalam I converted it to pure TensorFlow, and it takes about 400~500 ms on an i5 CPU. On GPU, PyTorch and TensorFlow take almost the same time.

ABlueLight avatar Jul 30 '19 03:07 ABlueLight

@sicarioakki The ONNX model should be loaded with onnx.load(), I think.

ABlueLight avatar Jul 30 '19 03:07 ABlueLight
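
The `invalid load key, '\x08'` error reported earlier is consistent with an .onnx file being passed to `torch.load`, which expects a pickled PyTorch checkpoint. A minimal sketch of picking the loader from the file extension; `model_format` and `load_model` are hypothetical helper names, and the extension check is a simple heuristic rather than a real format probe.

```python
from pathlib import Path


def model_format(path):
    """Guess the model format from the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".onnx":
        return "onnx"
    if suffix in {".pth", ".pt"}:
        return "torch"
    raise ValueError(f"unrecognised model file extension: {suffix!r}")


def load_model(path):
    """Load with the library that matches the file format (imports are lazy)."""
    if model_format(path) == "onnx":
        import onnx  # requires the onnx package

        return onnx.load(path)
    import torch  # requires PyTorch

    return torch.load(path, map_location="cpu")


fmt = model_format("yolact_base_0_4000.onnx")
```

Note that `onnx.load` only returns the model description; actually running it needs a runtime such as onnxruntime or the onnx-tensorflow backend mentioned in this thread.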

> @ridasalam I converted it to pure TensorFlow, and it takes about 400~500 ms on an i5 CPU. On GPU, PyTorch and TensorFlow take almost the same time.

Can you share the TensorFlow project?

sdimantsd avatar Nov 14 '19 10:11 sdimantsd

> The environment I used: onnx 1.4.1, onnxruntime 0.4.0, torch 1.0.1, torchvision 0.2.1
>
> Run python eval.py --trained_model=weights/yolact_darknet53_54_800000.pth --score_threshold=0.3 --top_k=100 --cuda=False --image=dog.jpg to generate onnx file. And run python onnxeval.py --trained_model=weights/yolact_resnet50_54_800000.pth --score_threshold=0.3 --top_k=100 --cuda=False --image=dog.jpg to evaluate with onnx.

Hi, thanks for your code, but I could not find the code that generates the ONNX file in eval.py. Could anybody share the link?

JINGTING92 avatar Nov 19 '19 10:11 JINGTING92