TensorRT
Polygraphy: [HostToDeviceCopy] requires bool I/O but node can not be handled by Myelin.
Description
polygraphy run best_fp32.onnx --onnxrt --trt --save-engine=best_fp32.engine --workspace 100000000 --atol 1e-3 --rtol 1e-3 --verbose --onnx-outputs mark all --trt-outputs mark all --input-shapes 'images:[1, 3, 640, 768]' --fail-fast >result-run-fp32-markall.txt
Environment
TensorRT Version: 8.2.1
NVIDIA GPU: Jetson NX
NVIDIA Driver Version:
CUDA Version: 10.2
CUDNN Version:
Operating System: ubuntu
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable): 1.9.0
Baremetal or Container (if so, version):
Relevant Files
Steps To Reproduce
polygraphy run best_fp32.onnx --onnxrt --trt --save-engine=best_fp32.engine --workspace 100000000 --atol 1e-3 --rtol 1e-3 --verbose --onnx-outputs mark all --trt-outputs mark all --input-shapes 'images:[1, 3, 640, 768]' --fail-fast >result-run-fp32-markall.txt
Can you provide more details? What does the log look like? It would also help if you could share the ONNX model so we can reproduce the issue.
I only get errors when running with Polygraphy. Running it without "mark all" is fine. I need to find the tensors that do not match the ONNX model's outputs. In addition, I would like to know how to produce the output using bisect.
result-run-fp32tofp32-markall.txt Thank you!
"mark all" is not a good choice: it breaks all TensorRT graph fusion and thus might change the inference result. I would suggest running without "mark all" first and enabling the verbose log with -vv, so you can see in the log how the layers are fused. Then mark only the last node of each fused node as an output.
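The two steps suggested above might look roughly like the sketch below. It reuses the model name, tolerances, and input shape from the original command. The tensor name example_fused_output is a hypothetical placeholder, not a name from this model; it would need to be replaced with a tensor name read from the -vv log. The run_or_print helper is also an assumption added for illustration: it only prints each command unless RUN=1 is set.

```shell
# Hypothetical tensor name: replace it with the output tensor of the last
# node in a fused layer, as reported in the -vv log from step 1.
TENSOR="${TENSOR:-example_fused_output}"

# Helper (illustrative only): prints the command by default,
# and executes it when RUN=1 is set in the environment.
run_or_print() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Step 1: build with TensorRT only, without "mark all", at extra verbosity,
# so the log shows how layers are fused.
run_or_print polygraphy run best_fp32.onnx --trt -vv \
  --input-shapes 'images:[1, 3, 640, 768]'

# Step 2: compare ONNX Runtime against TensorRT on one chosen tensor
# instead of marking every tensor as an output.
run_or_print polygraphy run best_fp32.onnx --onnxrt --trt \
  --atol 1e-3 --rtol 1e-3 \
  --onnx-outputs "$TENSOR" --trt-outputs "$TENSOR" \
  --input-shapes 'images:[1, 3, 640, 768]' --fail-fast
```

To actually run the commands, set RUN=1 and, as in the original command, redirect the output to a text file so the log can be searched. Step 2 can be repeated with different tensor names to narrow down where the TensorRT and ONNX Runtime results start to diverge.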
Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions. Thanks!