TensorRT
[runner.cpp::executeMyelinGraph::715] Error Code 1: Myelin ([myelinGraphExecute] Called without resolved dynamic shapes.)
Description
I converted an ONNX model to a TensorRT engine and am using the C++ API on a T600 GPU. When I execute bool status = context_->executeV2(buffers.getDeviceBindings().data()); it reports the error: [runner.cpp::executeMyelinGraph::715] Error Code 1: Myelin ([myelinGraphExecute] Called without resolved dynamic shapes.). The exported ONNX file itself is complete.
More information is as follows:
[11/15/2023-11:38:41] [I] [TRT] Loaded engine size: 50 MiB
[11/15/2023-11:38:41] [V] [TRT] Deserialization required 23795 microseconds.
[11/15/2023-11:38:41] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +44, now: CPU 0, GPU 88 (MiB)
[11/15/2023-11:38:41] [V] [TRT] Total per-runner device persistent memory is 13920256
[11/15/2023-11:38:41] [V] [TRT] Total per-runner host persistent memory is 69152
[11/15/2023-11:38:41] [V] [TRT] Allocated activation device memory of size 611810304
[11/15/2023-11:38:41] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +596, now: CPU 0, GPU 684 (MiB)
[11/15/2023-11:38:41] [V] [TRT] CUDA lazy loading is enabled.
[11/15/2023-11:38:41] [I] [TRT] [MS] Running engine with multi stream info
[11/15/2023-11:38:41] [I] [TRT] [MS] Number of aux streams is 1
[11/15/2023-11:38:41] [I] [TRT] [MS] Number of total worker streams is 2
[11/15/2023-11:38:41] [I] [TRT] [MS] The main stream provided by execute/enqueue calls is the first worker stream
[11/15/2023-11:38:41] [V] [TRT] Total per-runner device persistent memory is 1025024
[11/15/2023-11:38:41] [V] [TRT] Total per-runner host persistent memory is 787408
[11/15/2023-11:38:41] [V] [TRT] Allocated activation device memory of size 35127296
[11/15/2023-11:38:41] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +1, GPU +35, now: CPU 1, GPU 719 (MiB)
[11/15/2023-11:38:41] [V] [TRT] CUDA lazy loading is enabled.
[11/15/2023-11:38:41] [I] [TRT] [MS] Running engine with multi stream info
[11/15/2023-11:38:41] [I] [TRT] [MS] Number of aux streams is 1
[11/15/2023-11:38:41] [I] [TRT] [MS] Number of total worker streams is 2
[11/15/2023-11:38:41] [I] [TRT] [MS] The main stream provided by execute/enqueue calls is the first worker stream
[11/15/2023-11:38:41] [V] [TRT] Total per-runner device persistent memory is 0
[11/15/2023-11:38:41] [V] [TRT] Total per-runner host persistent memory is 9472
[11/15/2023-11:38:41] [V] [TRT] Allocated activation device memory of size 116929536
[11/15/2023-11:38:42] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +111, now: CPU 1, GPU 830 (MiB)
[11/15/2023-11:38:42] [V] [TRT] CUDA lazy loading is enabled.
[11/15/2023-11:38:42] [E] [TRT] 1: [runner.cpp::executeMyelinGraph::715] Error Code 1: Myelin ([myelinGraphExecute] Called without resolved dynamic shapes.)
I do call setBindingDimensions() for my input tensors. Example code follows:
const int keypoints_0_index = engine_->getBindingIndex(lightglue_config_.input_tensor_names[0].c_str());
const int keypoints_1_index = engine_->getBindingIndex(lightglue_config_.input_tensor_names[1].c_str());
const int descriptors_0_index = engine_->getBindingIndex(lightglue_config_.input_tensor_names[2].c_str());
const int descriptors_1_index = engine_->getBindingIndex(lightglue_config_.input_tensor_names[3].c_str());
const int output_match_index = engine_->getBindingIndex(lightglue_config_.output_tensor_names[0].c_str());
const int output_score_index = engine_->getBindingIndex(lightglue_config_.output_tensor_names[1].c_str());
context_->setBindingDimensions(keypoints_0_index, Dims3(1, features0.cols(), 2));
context_->setBindingDimensions(keypoints_1_index, Dims3(1, features1.cols(), 2));
context_->setBindingDimensions(descriptors_0_index, Dims3(1, features0.cols(), 256));
context_->setBindingDimensions(descriptors_1_index, Dims3(1, features1.cols(), 256));
keypoints_0_dims_ = context_->getBindingDimensions(keypoints_0_index);
keypoints_1_dims_ = context_->getBindingDimensions(keypoints_1_index);
descriptors_0_dims_ = context_->getBindingDimensions(descriptors_0_index);
descriptors_1_dims_ = context_->getBindingDimensions(descriptors_1_index);
std::cout << " " << keypoints_0_dims_ << keypoints_1_dims_ << descriptors_0_dims_ << descriptors_1_dims_
          << " " << output_match_dims_ << " " << output_score_dims_ << std::endl;
The printed result is:
(1, 605, 2)(1, 613, 2)(1, 605, 256)(1, 613, 256) (0, 2) (0)
then
if (!process_input(buffers, norm_keypoints0, norm_keypoints1)) {
return false;
}
buffers.copyInputToDevice();
bool status = context_->executeV2(buffers.getDeviceBindings().data());
if (!status) {
std::cout<<" infer failed! "<<output_match_dims_<<std::endl;
return false;
}
The error is reported here.
Environment
TensorRT Version: 8.6.1
NVIDIA GPU: T600
NVIDIA Driver Version: 535.113.01
CUDA Version: 11.8
CUDNN Version: 8.9.4.25
Operating System: Ubuntu 20.04
The ONNX model can be found here: onnx. Thanks.
I've requested access.
Could you please try this with trtexec? e.g. trtexec --onnx=model.onnx.
If possible, it would be great if you could test the latest 9.1 release or other newer GPUs.
Thanks for your reply. I will try to solve it. Another similar ONNX file is here: onnx.
@001SCH Hello! How was your problem solved?
@001SCH Hello! How was your problem solved? I'm encountering the same problem.
@001SCH