mmdetection-to-tensorrt

int8 model convert error

Open · Chen-cyw opened this issue 4 years ago · 4 comments

env: GPU: Tesla T4, nvidia-driver: 450.51.6, cuda: 11.03, cudnn: 8.04, tensorrt: 7.1.3.4, pytorch: 1.6/1.7, torchvision: 0.7/0.8, mmdetection: 2.7
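The report does not include the conversion call itself; a minimal sketch of what an int8 conversion with the mmdet2trt Python API might look like is below. The parameter names (`fp16_mode`, `int8_mode`, `int8_calib_dataset`) are assumptions based on the project's documented usage and may differ between versions:

```python
# Hypothetical int8 conversion sketch for mmdet2trt.
# Parameter names (fp16_mode, int8_mode, int8_calib_dataset) are
# assumptions and may differ in your version of the project.
def convert_int8(cfg_path, ckpt_path, save_path, calib_dataset=None):
    # Imports are kept inside the function so the sketch can be read
    # (and the function defined) without mmdet2trt/torch installed.
    import torch
    from mmdet2trt import mmdet2trt

    trt_model = mmdet2trt(
        cfg_path,                       # mmdetection model config
        ckpt_path,                      # trained checkpoint
        fp16_mode=False,
        int8_mode=True,                 # enable int8 calibration
        int8_calib_dataset=calib_dataset,
        device="cuda:0",
    )
    torch.save(trt_model.state_dict(), save_path)
```

The crash above happens during the calibration pass, so the calibration dataset and the NMS plugin in amirstan_plugin are the components being exercised when the illegal memory access occurs.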

Warning: Encountered known unsupported method torch.Tensor.new_tensor
Warning: Encountered known unsupported method torch.Tensor.new_tensor
Warning: Encountered known unsupported method torch.Tensor.new_tensor
Warning: Encountered known unsupported method torch.Tensor.new_tensor
Warning: Encountered known unsupported method torch.Tensor.new_tensor
Warning: Encountered known unsupported method torch.Tensor.new_tensor
[TensorRT] INFO: Detected 1 inputs and 4 output network tensors.
[TensorRT] INFO: Starting Calibration.
[TensorRT] ERROR: engine.cpp (936) - Cuda Error in executeInternal: 700 (an illegal memory access was encountered)
[TensorRT] ERROR: FAILED_EXECUTION: std::exception
[TensorRT] ERROR: ../rtSafe/safeRuntime.cpp (32) - Cuda Error in free: 700 (an illegal memory access was encountered)
terminate called after throwing an instance of 'nvinfer1::CudaError'
  what():  std::exception
Aborted

Details from cuda-memcheck:

========= Invalid __global__ read of size 4
=========     at 0x00000960 in void allClassNMS_kernel<float, float, int=2>(int, int, int, int, float, bool, bool, float*, float*, int*, float*, int*, bool)
=========     by thread (288,0,0) in block (0,0,0)
=========     Address 0x7f2751b0d890 is out of bounds
=========     Device Frame:void allClassNMS_kernel<float, float, int=2>(int, int, int, int, float, bool, bool, float*, float*, int*, float*, int*, bool) (void allClassNMS_kernel<float, float, int=2>(int, int, int, int, float, bool, bool, float*, float*, int*, float*, int*, bool) : 0x960)
=========     Saved host backtrace up to driver entry point at kernel launch time
=========     Host Frame:/lib64/libcuda.so (cuLaunchKernel + 0x34e) [0x2d725e]
=========     Host Frame:/data/cyw/mmdet2trt/amirstan_plugin/build/lib/libamirstan_plugin.so [0x7ea0b]
=========     Host Frame:/data/cyw/mmdet2trt/amirstan_plugin/build/lib/libamirstan_plugin.so [0xc0751]
=========     Host Frame:/data/cyw/mmdet2trt/amirstan_plugin/build/lib/libamirstan_plugin.so (_Z18allClassNMS_kernelIffLi2EEviiiifbbPT0_PT_PiS3_S4_b + 0x1ea) [0x5e4ea]
=========     Host Frame:/data/cyw/mmdet2trt/amirstan_plugin/build/lib/libamirstan_plugin.so (_Z15allClassNMS_gpuIffE14pluginStatus_tP11CUstream_stiiiifbbPvS3_S3_S3_S3_b + 0x14d) [0x5d2dd]
=========     Host Frame:/data/cyw/mmdet2trt/amirstan_plugin/build/lib/libamirstan_plugin.so (_Z12nmsInferenceP11CUstream_stiiibiiiiiffN8nvinfer18DataTypeEPKvS2_S4_PvS5_S5_S5_S5_bbb + 0x2c8) [0x5cdb8]
=========     Host Frame:/data/cyw/mmdet2trt/amirstan_plugin/build/lib/libamirstan_plugin.so (_ZN8amirstan6plugin22BatchedNMSPluginCustom7enqueueEPKN8nvinfer116PluginTensorDescES5_PKPKvPKPvSA_P11CUstream_st + 0x74) [0x57694]
=========     Host Frame:/data/cyw/TensorRT-7.1.3.4/lib/libnvinfer.so.7 (_ZNK8nvinfer12rt4cuda24PluginV2DynamicExtRunner7executeERKNS0_13CommonContextERKNS0_19ExecutionParametersE + 0x3e4) [0x76b5a4]
=========     Host Frame:/data/cyw/TensorRT-7.1.3.4/lib/libnvinfer.so.7 (_ZN8nvinfer12rt16ExecutionContext15enqueueInternalEPP10CUevent_st + 0x4a7) [0x6f7287]
=========     Host Frame:/data/cyw/TensorRT-7.1.3.4/lib/libnvinfer.so.7 (_ZN8nvinfer12rt16ExecutionContext9enqueueV2EPPvP11CUstream_stPP10CUevent_st + 0x1fc) [0x6f8f0c]
=========     Host Frame:/data/cyw/miniconda/envs/open-mmlab1.6/lib/python3.7/site-packages/tensorrt/tensorrt.so [0x9ece6]
=========     Host Frame:/data/cyw/miniconda/envs/open-mmlab1.6/lib/python3.7/site-packages/tensorrt/tensorrt.so [0xd91e4]
=========     Host Frame:python (_PyMethodDef_RawFastCallKeywords + 0x274) [0x165914]
=========     Host Frame:python (_PyCFunction_FastCallKeywords + 0x21) [0x165a31]
=========     Host Frame:python (_PyEval_EvalFrameDefault + 0x52fe) [0x1d239e]
=========     Host Frame:python (_PyEval_EvalCodeWithName + 0x2f9) [0x114829]
=========     Host Frame:python (_PyFunction_FastCallDict + 0x1d5) [0x115925]
=========     Host Frame:python (_PyObject_Call_Prepend + 0x63) [0x1344d3]
=========     Host Frame:python (PyObject_Call + 0x6e) [0x126ffe]
=========     Host Frame:python (_PyEval_EvalFrameDefault + 0x1e4a) [0x1ceeea]
=========     Host Frame:python (_PyEval_EvalCodeWithName + 0x2f9) [0x114829]
=========     Host Frame:python (_PyFunction_FastCallDict + 0x1d5) [0x115925]
=========     Host Frame:python (_PyObject_Call_Prepend + 0x63) [0x1344d3]
=========     Host Frame:python [0x16be1a]
=========     Host Frame:python (_PyObject_FastCallKeywords + 0x48b) [0x16cccb]
=========     Host Frame:python (_PyEval_EvalFrameDefault + 0x49e6) [0x1d1a86]
=========     Host Frame:python (_PyFunction_FastCallKeywords + 0xfb) [0x164e7b]
=========     Host Frame:python (_PyEval_EvalFrameDefault + 0x416) [0x1cd4b6]
=========     Host Frame:python (_PyFunction_FastCallKeywords + 0xfb) [0x164e7b]
=========     Host Frame:python (_PyEval_EvalFrameDefault + 0x416) [0x1cd4b6]
=========     Host Frame:python (_PyFunction_FastCallKeywords + 0xfb) [0x164e7b]
=========     Host Frame:python (_PyEval_EvalFrameDefault + 0x416) [0x1cd4b6]
=========     Host Frame:python (_PyEval_EvalCodeWithName + 0x2f9) [0x114829]
=========     Host Frame:python (PyEval_EvalCodeEx + 0x44) [0x115714]
=========     Host Frame:python (PyEval_EvalCode + 0x1c) [0x11573c]
=========     Host Frame:python [0x22cf14]
=========     Host Frame:python (PyRun_FileExFlags + 0xa1) [0x237331]
=========     Host Frame:python (PyRun_SimpleFileExFlags + 0x1c3) [0x237523]
=========     Host Frame:python [0x238655]
=========     Host Frame:python (_Py_UnixMain + 0x3c) [0x23877c]
=========     Host Frame:/lib64/libc.so.6 (__libc_start_main + 0xf5) [0x223d5]
=========     Host Frame:python [0x1dcff0]
=========
#assertion/data/cyw/mmdet2trt/amirstan_plugin/src/plugin/batchedNMSPlugin/batchedNMSPlugin.cpp,140
========= Error: process didn't terminate successfully
========= No CUDA-MEMCHECK results found

Chen-cyw · Dec 25 '20

Hi, could you provide the model config?

grimoire · Dec 26 '20

@grimoire cascade_rcnn_r50_fpn_dconv_c3-c5_1x_coco.txt. I can convert it with cuda=10.2, cudnn=8.03, tensorrt=7.2.1, pytorch=1.6, GPU=1080.

Chen-cyw · Dec 26 '20

Hi, I can convert with cuda=10.2, tensorrt=7.1.3.4, GPU=1080 Ti, so I guess this is related to the GPU type or CUDA version. Sorry, I don't have a solution for now; I will keep tracking this. You could also post a thread about it on the NVIDIA forums to see if they have any advice.

grimoire · Dec 27 '20
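Since the conversion works on one machine (1080 Ti, CUDA 10.2) but fails on another (T4, CUDA 11), a small helper like the one below can dump the library versions that matter so the two environments can be compared line by line. It only reports what is importable, so it is safe to run on either machine:

```python
import importlib


def module_version(name, attr="__version__"):
    """Return a module's version string, 'not installed' if the module
    is missing, or 'unknown' if it has no version attribute."""
    try:
        mod = importlib.import_module(name)
    except ImportError:
        return "not installed"
    value = getattr(mod, attr, None)
    return str(value) if value is not None else "unknown"


if __name__ == "__main__":
    # tensorrt exposes __version__; torch.version.cuda additionally
    # holds the CUDA toolkit version PyTorch was built against.
    for name in ("torch", "torchvision", "tensorrt", "mmdet"):
        print(f"{name}: {module_version(name)}")
```

Running this on both the failing T4 box and the working 1080/1080 Ti box makes it easy to spot which component differs besides the GPU itself.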

Thanks. If I find any solution, I will report it here.

Chen-cyw · Dec 28 '20