DeDoDe-ONNX-TensorRT

Are you interested in doing an onnx export for other work?

Open GavinYang5 opened this issue 1 year ago • 9 comments

Thank you so much for the great work you do. I'm currently trying to export GeoTransformer (https://github.com/qinzheng93/GeoTransformer) to an ONNX model, but I'm having trouble with the conversion. It's been a long time since the original author maintained it. I know you've done a great job with Torch2Onnx, so if you can help me, I would appreciate it.

GavinYang5 avatar Jan 05 '24 10:01 GavinYang5

Hi @GhYang0519, thank you for your interest.

I'd be glad to help. What problems are you facing when exporting GeoTransformer to ONNX?

fabio-sim avatar Jan 05 '24 15:01 fabio-sim

I'm so glad you got back to me. [image] I'm having this issue when exporting an ONNX model.

GavinYang5 avatar Jan 05 '24 15:01 GavinYang5

I am a beginner at exporting ONNX models from torch. I tried to find an example of exporting a point cloud registration model from Torch to ONNX, but I failed. I have been searching for this error, and some opinions online suggest avoiding torch.bool tensors. GeoTransformer uses a large number of torch.bool tensors as indexes, and it takes a dict as input in forward() (isn't that unreasonable?).

GavinYang5 avatar Jan 05 '24 15:01 GavinYang5
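
On the dict input raised above: one common workaround is to wrap the model in a thin module whose forward() takes plain tensors and rebuilds the dict internally. The following is only a sketch under assumptions. `DictModel`, `TensorInputWrapper`, and the keys `ref_points` / `src_points` are hypothetical stand-ins, not GeoTransformer's actual classes or input keys.

```python
import torch
from torch import nn


class DictModel(nn.Module):
    """Hypothetical stand-in for a model whose forward() takes a dict."""

    def forward(self, data_dict):
        # Toy computation: distance between the two cloud centroids.
        ref = data_dict["ref_points"].mean(dim=0)
        src = data_dict["src_points"].mean(dim=0)
        return (ref - src).norm()


class TensorInputWrapper(nn.Module):
    """Exposes plain tensor arguments and rebuilds the dict internally."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, ref_points, src_points):
        return self.model({"ref_points": ref_points, "src_points": src_points})


wrapper = TensorInputWrapper(DictModel()).eval()
ref_points = torch.zeros(4, 3)  # toy point clouds, assumed shapes (N, 3)
src_points = torch.ones(5, 3)
distance = wrapper(ref_points, src_points)
# The wrapper, rather than the original model, would then be handed to
# torch.onnx.export with (ref_points, src_points) as the example inputs.
```

Whether this is enough for a given model depends on what the forward pass does with the dict internally; it only removes the dict from the exported signature.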

I've seen this error before (see this related Stack Overflow issue). I believe it occurs in the following lines:

https://github.com/qinzheng93/GeoTransformer/blob/e7a135af4c318ff3b8d7f6c963df094d7e4ea540/geotransformer/modules/ops/pointcloud_partition.py#L87-L88

node_masks = torch.zeros(nodes.shape[0], dtype=torch.bool).cuda()  # (M,)
node_masks.index_fill_(0, point_to_node, True)

A possible solution is to change them to:

node_masks = torch.zeros(nodes.shape[0]).cuda()  # (M,)
node_masks.index_fill_(0, point_to_node, 1)

fabio-sim avatar Jan 05 '24 15:01 fabio-sim
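
A small CPU-only check that the float formulation marks the same nodes as the bool one may help when applying this change. The tensor values below are made up for illustration, and nothing here guarantees that the exporter accepts the result. Note also that a float mask cannot be used directly for boolean indexing, so downstream code that indexes with node_masks would need an explicit comparison or torch.nonzero, as sketched at the end.

```python
import torch

# Toy stand-ins (assumed values): 5 nodes, and each point's assigned node index.
num_nodes = 5
point_to_node = torch.tensor([0, 2, 2, 4])

# Original formulation with a bool mask.
bool_masks = torch.zeros(num_nodes, dtype=torch.bool)
bool_masks.index_fill_(0, point_to_node, True)

# Suggested formulation with a float mask.
float_masks = torch.zeros(num_nodes)
float_masks.index_fill_(0, point_to_node, 1)

# Indices of nodes that own at least one point, recovered from the float mask.
node_indices = torch.nonzero(float_masks > 0).squeeze(1)
```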

Yes, I noticed the issue you mentioned. I have made as many modifications as possible to avoid using torch.bool, but the error still exists TAT

GavinYang5 avatar Jan 05 '24 15:01 GavinYang5

I see.. Then perhaps GeoTransformer is too complicated to convert to ONNX

fabio-sim avatar Jan 05 '24 15:01 fabio-sim

The model I want to export is this: https://github.com/qinzheng93/GeoTransformer/blob/e7a135af4c318ff3b8d7f6c963df094d7e4ea540/experiments/geotransformer.3dmatch.stage4.gse.k3.max.oacl.stage2.sinkhorn/model.py#L19 Do you think there is a good way to investigate, step by step, which part is causing the problem? Currently I only get an error message with no indication of where it was raised, so I am at a loss as to how to troubleshoot it.

GavinYang5 avatar Jan 05 '24 15:01 GavinYang5

Yes, if you're using VSCode to debug the export process, you can place breakpoints in the forward passes of modules, starting from the beginning and gradually advancing further until VSCode hits that error instead of a breakpoint. That should give you an idea of the location of the problematic op.

fabio-sim avatar Jan 05 '24 15:01 fabio-sim
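
The same narrowing-down can also be scripted: run each stage separately and report the first one that raises. The sketch below is plain Python under assumptions. `first_failing_stage`, `failing_export`, and the stage names are hypothetical placeholders; in practice each callable would wrap a torch.onnx.export call on one submodule with representative inputs.

```python
from typing import Callable, Optional, Sequence, Tuple


def first_failing_stage(
    stages: Sequence[Tuple[str, Callable[[], object]]],
) -> Optional[Tuple[str, Exception]]:
    """Run stages in order; return (name, error) for the first that raises."""
    for name, run in stages:
        try:
            run()
        except Exception as error:
            return name, error
    return None


def failing_export():
    # Placeholder for an export attempt that hits an unsupported op.
    raise RuntimeError("unsupported op")


result = first_failing_stage(
    [
        ("backbone", lambda: None),
        ("coarse_matching", failing_export),
        ("fine_matching", lambda: None),
    ]
)
```

Once a stage is identified, the same helper can be applied to that stage's own submodules to narrow the search further.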

Okay, thank you very much for your valuable feedback. I'll keep trying.

GavinYang5 avatar Jan 05 '24 15:01 GavinYang5