DataType mismatch in TensorRT 10.0.1.6 when running RT-DETRv2 on a GTX 1650 Ti
Description
I used the TensorRT C++ and CUDA APIs to run inference on the RT-DETRv2 model, and at runtime I found that two TensorRT APIs report inconsistent information. I call `nvinfer1::ICudaEngine::getTensorDataType` and `nvinfer1::ICudaEngine::getTensorFormatDesc` like this:
void allocator() {
    TensorNum = engine->getNbIOTensors();
    // Allocate memory for every tensor, inputs and outputs alike
    for (int i = 0; i < TensorNum; i++) {
        // Query the tensor's name, shape, and data type
        const char* name = engine->getIOTensorName(i);
        nvinfer1::Dims dims = engine->getTensorShape(name);
        nvinfer1::DataType type = engine->getTensorDataType(name);
        // Tensor I/O mode (input or output)
        const char* mode_name = engine->getTensorIOMode(name) == nvinfer1::TensorIOMode::kINPUT ? "input" : "output";
        // Tensor name
        tensor_name.emplace_back(name);
        // Image size (height, width)
        tensor_size.emplace_back(std::make_pair(dims.d[2], dims.d[3]));
        // Compute the tensor's size in bytes
        int nbytes = perbytes[int(type)];
        for (int d = 0; d < dims.nbDims; d++)
            nbytes = nbytes * dims.d[d];
        tensor_bytes.emplace_back(nbytes);
        // Allocate device memory and record it in the map: name -> device pointer
        name_ptr.insert(std::make_pair(name, safeCudaMalloc(nbytes)));
        std::cout
            << " tensor mode : " << mode_name
            << " , tensor name : " << name
            << " , tensor dim : " << dims.d[0] << " X " << dims.d[1] << " X " << dims.d[2] << " X " << dims.d[3]
            << " , tensor bytes : " << nbytes
            << " , getTensorDataType output type : " << type_name[int(type)]
            << " , getTensorFormatDesc output description : " << engine->getTensorFormatDesc(name)
            << std::endl;
    }
}
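For completeness, `perbytes`, `type_name`, and `safeCudaMalloc` are my own helpers, not TensorRT APIs; they look roughly like this (a sketch, where the table layout assumes the `nvinfer1::DataType` enum values of TensorRT 10, with kINT64 = 8):

#include <cstdlib>
#include <iostream>
#include <cuda_runtime_api.h>

// Bytes per element, indexed by int(nvinfer1::DataType):
// kFLOAT=0, kHALF=1, kINT8=2, kINT32=3, kBOOL=4, kUINT8=5, kFP8=6, kBF16=7, kINT64=8
const int perbytes[] = { 4, 2, 1, 4, 1, 1, 1, 2, 8 };
const char* type_name[] = { "kFLOAT", "kHALF", "kINT8", "kINT32", "kBOOL", "kUINT8", "kFP8", "kBF16", "kINT64" };

// Thin wrapper around cudaMalloc that aborts on allocation failure
void* safeCudaMalloc(size_t nbytes) {
    void* ptr = nullptr;
    cudaError_t err = cudaMalloc(&ptr, nbytes);
    if (err != cudaSuccess) {
        std::cerr << "cudaMalloc failed: " << cudaGetErrorString(err) << std::endl;
        std::abort();
    }
    return ptr;
}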
I found that the information reported by the two APIs differs, as follows:
tensor mode : input , tensor name : images , tensor dim : 1 X 3 X 640 X 640 , tensor bytes : 4915200 , getTensorDataType output type : kFLOAT , getTensorFormatDesc output description : Row major linear FP32 format (kLINEAR)
tensor mode : input , tensor name : orig_target_sizes , tensor dim : 1 X 2 X 0 X 0 , tensor bytes : 16 , getTensorDataType output type : kINT64 , getTensorFormatDesc output description : Row major linear INT8 format (kLINEAR)
tensor mode : output , tensor name : labels , tensor dim : 1 X 300 X 0 X 0 , tensor bytes : 2400 , getTensorDataType output type : kINT64 , getTensorFormatDesc output description : Row major linear INT8 format (kLINEAR)
tensor mode : output , tensor name : boxes , tensor dim : 1 X 300 X 4 X 0 , tensor bytes : 4800 , getTensorDataType output type : kFLOAT , getTensorFormatDesc output description : Row major linear FP32 format (kLINEAR)
tensor mode : output , tensor name : scores , tensor dim : 1 X 300 X 0 X 0 , tensor bytes : 1200 , getTensorDataType output type : kFLOAT , getTensorFormatDesc output description : Row major linear FP32 format (kLINEAR)
Look at the tensor orig_target_sizes: getTensorDataType reports that its data type is kINT64, but getTensorFormatDesc reports INT8. In my testing, getTensorFormatDesc is the correct one: when I allocate CUDA memory according to what getTensorFormatDesc reports, the model runs without any problem. This troubles me a lot, because my allocator uses getTensorDataType to size the device buffers automatically, and now I have to size them by hand for each model. Could you please fix it?
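My current stopgap is a defensive sizing helper along these lines (a rough sketch of my own workaround, not a TensorRT API): it pulls the type token out of the getTensorFormatDesc string and takes the larger of the two implied element sizes, so the buffer is never under-allocated whichever API turns out to be right.

#include <algorithm>
#include <string>
#include <NvInferRuntime.h>

// Hypothetical workaround: size each element by the larger of the two
// interpretations reported by the engine.
size_t safeElementSize(const nvinfer1::ICudaEngine& engine, const char* name) {
    size_t fromType = perbytes[int(engine.getTensorDataType(name))]; // table above
    // getTensorFormatDesc returns a human-readable string such as
    // "Row major linear INT8 format (kLINEAR)"; match the type token in it.
    std::string desc = engine.getTensorFormatDesc(name);
    size_t fromDesc = fromType;
    if (desc.find("FP32") != std::string::npos)       fromDesc = 4;
    else if (desc.find("FP16") != std::string::npos)  fromDesc = 2;
    else if (desc.find("INT32") != std::string::npos) fromDesc = 4;
    else if (desc.find("INT8") != std::string::npos)  fromDesc = 1;
    return std::max(fromType, fromDesc);
}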
Environment
TensorRT Version: 10.0.1.6
NVIDIA GPU: GTX 1650 Ti (4 GB)
NVIDIA Driver Version: 12.2
CUDA Version: 12.2
CUDNN Version: 8.8
Operating System: Windows 11
Python Version (if applicable): None
Tensorflow Version (if applicable): None
PyTorch Version (if applicable): None
Baremetal or Container (if so, version): None
Relevant Files
Model link: None
Steps To Reproduce
Commands or scripts: No
Have you tried the latest release?: Not yet
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt): Yes