How do I run forward computing with C++?

FightStone opened this issue 6 years ago • 44 comments

Thanks for the open source code and the outstanding contributions. I would like to ask how to deploy YOLACT in actual production with C++. I want to convert a .pth model file into a .pt model file. Could you give me some reasonable suggestions? And what is the role of 'use_jit' in the configuration file? Thank you so much!

FightStone avatar Jun 10 '19 11:06 FightStone

Do you mean through PyTorch 1.0 torchscript? Doing so would essentially require a total rewrite, as the Python I use is too complex to easily swap out for torchscript. You could probably interface with Python using the PyTorch C++ bindings, but the model would still be run in Python (i.e., slower than native C++).

And what do you mean by converting a .pth file into a .pt file? If I understand this correctly, .pt is just another name for .pth.

I believe I removed use_jit, but basically I already wrapped the simple parts of the model (the backbone and FPN) in torchscript. use_jit just specified whether or not to use those torchscript versions. I added it because torchscript does not play nicely with multiple GPUs, but now I've made it automatically disable the JIT when it detects multiple GPUs.
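For illustration, a minimal sketch of that pattern with a stand-in module (an assumption for clarity, not the repository's actual backbone/FPN code):

    import torch
    import torch.nn as nn

    class TinyFPN(nn.Module):
        # Stand-in for a simple submodule such as the FPN; illustration only.
        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return torch.relu(x)

    # Mirror the auto-disable behaviour described above: only use the scripted
    # version when at most one GPU is in use.
    use_jit = torch.cuda.device_count() <= 1
    fpn = TinyFPN()
    fpn = torch.jit.script(fpn) if use_jit else fpn
    print(fpn(torch.randn(1, 8)))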

dbolya avatar Jun 10 '19 11:06 dbolya

I want to convert a Python-generated model file (.pth) into a model file (.pt) that C++ can call (https://pytorch.org/tutorials/advanced/cpp_export.html#). I don't know which method to use. Could you give me some reasonable suggestions? Thanks for your prompt reply!

FightStone avatar Jun 10 '19 12:06 FightStone

Ah, I see. Like I said, porting all of Yolact to torchscript would require rewriting almost everything. You might have more luck with tracing, but I think it requires functions to take and return only tensors, and I definitely don't do that (so it would also require some rewriting, though much less than the full torchscript route). You can try the tracing section of that tutorial, but there will be errors you'll need to fix (at least I think so).
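To make the idea concrete, a hedged sketch of the tracing route: wrap the model so that tracing only sees tensors. The dict keys match the ones mentioned later in this thread ('loc', 'conf', 'mask', 'proto'); the rest is illustrative, and internal code may still need the rewrites mentioned above.

    import torch
    import torch.nn as nn

    class TraceWrapper(nn.Module):
        # Wraps a model whose forward returns a dict so that tracing sees only tensors.
        def __init__(self, model):
            super().__init__()
            self.model = model

        def forward(self, x):
            out = self.model(x)  # assumed to return a dict of tensors
            return out['loc'], out['conf'], out['mask'], out['proto']

    # Hypothetical usage (the weight file name is just an example):
    # net = Yolact(); net.load_weights('weights/yolact_base_54_800000.pth'); net.eval()
    # traced = torch.jit.trace(TraceWrapper(net), torch.randn(1, 3, 550, 550))
    # traced.save('yolact_traced.pt')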

dbolya avatar Jun 10 '19 19:06 dbolya

@FightStone Maybe you can try this workflow: PyTorch -> ONNX -> NCNN. I have successfully done it, and tested the C++ inference code on my ARM device :)
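For reference, the ONNX leg of that workflow might look roughly like the sketch below, assuming the model has already been modified to return plain tensors (see the key points further down in this thread); the stand-in module is only for illustration.

    import torch
    import torch.nn as nn

    class Dummy(nn.Module):
        # Stand-in for the modified YOLACT that returns plain tensors.
        def forward(self, x):
            return x.mean(dim=(2, 3))

    dummy_input = torch.randn(1, 3, 550, 550)  # the 550x550 input size used in this thread
    torch.onnx.export(Dummy().eval(), dummy_input, "dummy.onnx", verbose=True)
    # The .onnx file can then be converted with NCNN's onnx2ncnn tool.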

Wilbur529 avatar Jun 12 '19 12:06 Wilbur529

@Wilbur529
When I try to convert yolact from pytorch to onnx, it complains:

Traceback (most recent call last):
  File "D:\Projects\Python\yolcat_seg\test.py", line 238, in <module>
    torch.onnx.export(model, batch, "yolact.onnx", verbose=True)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\onnx\__init__.py", line 27, in export
    return utils.export(*args, **kwargs)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 104, in export
    operator_export_type=operator_export_type)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 281, in _export
    example_outputs, propagate)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 224, in _model_to_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args, training)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 192, in _trace_and_get_graph_from_model
    trace, torch_out = torch.jit.get_trace_graph(model, args, _force_outplace=True)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\jit\__init__.py", line 197, in get_trace_graph
    return LegacyTracedModule(f, _force_outplace)(*args, **kwargs)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Programs\Python\Python37\lib\site-packages\torch\jit\__init__.py", line 253, in forward
    out_vars, _ = _flatten(out)
RuntimeError: Only tuples, lists and Variables supported as JIT inputs, but got dict

Can you please show your detailed steps to make it work?

ausk avatar Jun 13 '19 06:06 ausk

@ausk Hi, there are a few key points to note (see the sketch after this list):

  1. Turn off the JIT;
  2. Return only the raw output values from the network (instead of the post-processed result of the prediction head);
  3. Rewrite some code so that the parameters of certain operators are fixed constants, e.g. in the protonet and FPN;
  4. Decode the output yourself (post-processing, NMS, and so on).
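A minimal sketch of points 2 and 4, assuming the prediction dict uses the keys that appear later in this thread ('loc', 'conf', 'mask', 'proto') and using textbook SSD-style box decoding as the reference for decoding boxes yourself (not necessarily line-for-line what YOLACT does):

    import torch

    # Point 2: in Yolact.forward, return the raw head outputs instead of the
    # post-processed self.detect(pred_outs), e.g.
    #   return pred_outs['loc'], pred_outs['conf'], pred_outs['mask'], pred_outs['proto']

    # Point 4: decode boxes on the host side. Standard SSD-style decoding;
    # 'priors' are anchors in (cx, cy, w, h) form, 'variances' assumed to be (0.1, 0.2).
    def decode(loc, priors, variances=(0.1, 0.2)):
        boxes = torch.cat((
            priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:],
            priors[:, 2:] * torch.exp(loc[:, 2:] * variances[1])), dim=1)
        boxes[:, :2] -= boxes[:, 2:] / 2   # (cx, cy, w, h) -> (x1, y1, w, h)
        boxes[:, 2:] += boxes[:, :2]       # -> (x1, y1, x2, y2)
        return boxes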

Wilbur529 avatar Jun 13 '19 08:06 Wilbur529

@Wilbur529 Hi, thank you for your advice and patience. I followed your advice to export the model to ONNX format.

  1. Turn off the JIT: set the PYTORCH_JIT environment variable, remove the @torch.jit.script decorator, and delete JITModule and jit_backbone in yolact.py.
  2. Return a list of Tensors from PredictionModule::forward and Yolact::forward.
  3. Delete the decoding operations in Yolact::forward.

The input shape is:

torch.Size([1, 3, 550, 550])

The output shapes look OK:

torch.Size([1, 19248, 4])
torch.Size([1, 19248, 81])
torch.Size([1, 19248, 32])
torch.Size([1, 138, 138, 32])

Then when calling torch.onnx.export(model, batch, "yolact.onnx", verbose=True), it complains:

  File "D:\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 211, in _model_to_graph
    assert example_outputs is not None, "example_outputs must be provided when exporting a ScriptModule"
AssertionError: example_outputs must be provided when exporting a ScriptModule

It seems like the module is still running in JIT or script mode, but I'm sure the environment variable PYTORCH_JIT=0 is set. I just have no idea how to fix it. Have you encountered such an issue?
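One thing that may be worth double-checking (my assumption, not something confirmed in this thread): PYTORCH_JIT is read when torch is first imported, so setting it from inside Python after the import has no effect. Setting it in the shell before launching the script, or at the very top of the script, should work:

    import os
    os.environ['PYTORCH_JIT'] = '0'  # must happen before the first `import torch`

    import torch  # the flag is only picked up at import time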

ausk avatar Jun 14 '19 02:06 ausk

@ausk I turned off the JIT by modifying the config file. https://github.com/dbolya/yolact/blob/13b49d749b734b098a292c8c5226017b344ccc67/data/config.py#L566 Just add that line to your config dictionary and change it to True.

Wilbur529 avatar Jun 14 '19 02:06 Wilbur529

@Wilbur529 I forgot to say that I had already modified that line. I'm still trying to debug it. Thank you for your time. :)

ausk avatar Jun 14 '19 02:06 ausk

Hi @ausk, did you succeed in converting the weights to ONNX?

sdimantsd avatar Jun 23 '19 15:06 sdimantsd

@Wilbur529 @ausk Hello, were you able to successfully convert the YOLACT model to ONNX? I am facing the same issues during conversion. Would you be able to share the converted model and elaborate on the conversion steps? That would be very helpful.

abhigoku10 avatar Jun 25 '19 04:06 abhigoku10

@abhigoku10 Since the converted model may not help you directly, maybe you could share a description of the problem you ran into.

Wilbur529 avatar Jun 27 '19 01:06 Wilbur529

@Wilbur529 Hi, first of all, your success gave me the confidence to explore. :) :)

Secondly, I modified the code in yolact.py to return (pred_outs['loc'], pred_outs['conf'], pred_outs['mask'], pred_outs['proto']) instead of return self.detect(pred_outs).

Third, I modified the code like this to generate a forward model via tracing:

    traced_script_module = torch.jit.trace(net, batch)
    traced_script_module.save("xxx/model.pt")

Fourth, I successfully generated a forward model named model.pt.

But when I do inference like this:

    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::ones({1, 3, 550, 550}).to(at::kCUDA));
    at::Tensor output = module->forward(inputs).toTensor();

there is an error:

    terminate called after throwing an instance of 'c10::Error'
      what(): isTensor() INTERNAL ASSERT FAILED at /home/sn19038157/libtorch/include/ATen/core/ivalue_inl.h:119, please report a bug to PyTorch. (toTensor at /home/sn19038157/libtorch/include/ATen/core/ivalue_inl.h:119)

What should I do? I would be grateful if you could help me a little. :( :(

FightStone avatar Jul 01 '19 06:07 FightStone

@FightStone I think you are trying the PyTorch -> Caffe2 -> C++ workflow, which I haven't tried. Maybe you can try updating PyTorch to the latest version. But I still suggest trying PyTorch -> ONNX -> NCNN, because NCNN is a high-performance NN inference framework.

Wilbur529 avatar Jul 01 '19 06:07 Wilbur529

@Wilbur529 Can you share your forward-processing code? Or point to some classic C++ forward-processing code examples? There isn't much relevant information out there.

FightStone avatar Jul 01 '19 07:07 FightStone

@ausk Hi, there are a few key points to note:

  1. Turn off the JIT;
  2. Return only the raw output values from the network (instead of the post-processed result of the prediction head);
  3. Rewrite some code so that the parameters of certain operators are fixed constants, e.g. in the protonet and FPN;
  4. Decode the output yourself (post-processing, NMS, and so on).

@Wilbur529 Can you share the modified part of the code? Step 1 (turning off the JIT) is done, but the other steps are a bit confusing. Alternatively, could you elaborate on the changes to be made?

abhigoku10 avatar Jul 01 '19 08:07 abhigoku10

@FightStone For pre-processing, it's just a simple normalization. And for post-processing, you could refer to the implementation in NCNN (https://github.com/Tencent/ncnn/blob/master/src/layer/detectionoutput.cpp) :)

Wilbur529 avatar Jul 01 '19 09:07 Wilbur529

@abhigoku10 Sorry, I cannot show you this part of the code because of company rules. On the second point, you can return a list of these outputs. On the third point, since the ONNX framework only supports fixed net parameters, you need to change some variables into constants. For the last one, refer to the post-processing Python code of YOLACT or other implementations and translate it to C/C++. May success wait upon your efforts :)
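As a starting point for that translation, here is a minimal greedy NMS in plain PyTorch (the textbook version, not YOLACT's Fast NMS); the IoU threshold is only an example value.

    import torch

    def greedy_nms(boxes, scores, iou_threshold=0.5):
        # boxes: (N, 4) as (x1, y1, x2, y2); scores: (N,). Returns the kept indices.
        order = scores.argsort(descending=True)
        keep = []
        while order.numel() > 0:
            i = order[0].item()
            keep.append(i)
            if order.numel() == 1:
                break
            rest = order[1:]
            x1 = torch.max(boxes[i, 0], boxes[rest, 0])
            y1 = torch.max(boxes[i, 1], boxes[rest, 1])
            x2 = torch.min(boxes[i, 2], boxes[rest, 2])
            y2 = torch.min(boxes[i, 3], boxes[rest, 3])
            inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
            area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
            area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
            iou = inter / (area_i + area_r - inter)
            order = rest[iou <= iou_threshold]
        return torch.tensor(keep, dtype=torch.long)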

Wilbur529 avatar Jul 01 '19 09:07 Wilbur529

@Wilbur529 I shall try to do this. How much time do you think the whole conversion process will take? And after converting, was there any change in the FPS?

@FightStone Please share the code base which you have modified.

abhigoku10 avatar Jul 01 '19 10:07 abhigoku10

@abhigoku10 The FPS depends on the hardware environment, so only an experiment will tell you the answer :)

Wilbur529 avatar Jul 02 '19 01:07 Wilbur529

@Wilbur529 Yup, rightly said. So on your hardware environment, what FPS were you achieving?

abhigoku10 avatar Jul 02 '19 03:07 abhigoku10

@abhigoku10 I run YOLACT (ResNet-101) on my MacBook Pro with a 550x550 input size, at around 1.5 s per frame.

Wilbur529 avatar Jul 02 '19 04:07 Wilbur529

@Wilbur529 Thanks for the response. I am interested to know the FPS for the MobileNet architecture.

abhigoku10 avatar Jul 02 '19 05:07 abhigoku10

@Wilbur529 When I try to convert yolact from pytorch to onnx, it complains:

  RuntimeError: Only tuples, lists and Variables supported as JIT inputs, but got dict

(same traceback as @ausk posted above)

Can you please show your detailed steps to make it work?

Did you make the YOLACT code compatible to run on CPU, and then get this error while trying to convert the PyTorch model to ONNX format?

ashank-art avatar Jul 02 '19 09:07 ashank-art

@ashank-art Please follow my second key point: replace the output with a list of tensors that have no post-processing applied.

Wilbur529 avatar Jul 02 '19 10:07 Wilbur529

Has anyone tried Pytorch to Tensorflow conversion for Yolact?

ridasalam avatar Jul 22 '19 08:07 ridasalam

@ridasalam Nope, I have not tried TensorFlow; I am currently trying with ONNX. Let me know if you have tried TF. #74

abhigoku10 avatar Jul 22 '19 10:07 abhigoku10

Sup? Anything new with the PyTorch to TensorFlow conversion for YOLACT?

sdimantsd avatar Nov 14 '19 10:11 sdimantsd

Hi @FightStone, I am getting the same INTERNAL ASSERT FAILED error.

Any updates?

vatsalkansara2 avatar Jan 15 '20 06:01 vatsalkansara2

@Wilbur529 I changed the model output from a dict to a list, and can now get a converted ONNX file, though I am not sure whether the ONNX file is correct. From your comments above, I am confused about how to do the third point, as I am getting errors related to it. When I convert the ONNX model to MNN, it reports:

    type=slice, failed, may be some node is not const

Checking the PyTorch-to-ONNX graph, the failing slice ops are 873 and 874, which come from this slice operation:

    scores, idx2 = scores.sort(0, descending=True)
    idx2 = idx2[:cfg.max_num_detections]
    scores = scores[:cfg.max_num_detections]

(Before this line, the number of proposals k is not a fixed number; the input is a random 550x550x3 variable.) Do you have some advice? Thank you. I don't know how to modify this part.
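Not a verified fix, but one common direction for data-dependent slices like this is to keep the sort/truncation out of the exported graph entirely and do it in the host-side post-processing instead, for example:

    import numpy as np

    def topk_host_side(scores, max_num_detections=100):
        # Sort + truncate outside the exported graph; the max_num_detections value is only an example.
        idx = np.argsort(-scores)[:max_num_detections]
        return scores[idx], idx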

dzyjjpy avatar Feb 14 '20 08:02 dzyjjpy