
DETR on TRT8

Open vjsrinivas opened this issue 2 years ago • 8 comments

I'm attempting to get the DETR model working on TensorRT 8. I've added the same TRT8 macros that were added for YOLOv5, YOLOv3, etc., and updated the deprecated addMatrixMultiply call signature. After working through those errors, I'm stuck on this:

[07/21/2022-11:34:33] [E] [TRT] 2: [pointWiseV2Helpers.h::createTensorDesc::296] Error Code 2: Internal Error (Assertion tensor.extent.d[j] == 1 failed.)
Build engine successfully!
detr: /home/jetson/models/iris-quant-prune-exp/detr/detr.cpp:626: void BuildDETRModel(unsigned int, nvinfer1::IHostMemory**, const string&, std::__cxx11::string): Assertion `engine != nullptr' failed.

I don't know where to even look to start debugging this. Any advice is appreciated.

vjsrinivas avatar Jul 21 '22 15:07 vjsrinivas

@freedenS Any advice on this?

wang-xinyu avatar Jul 22 '22 03:07 wang-xinyu

First of all, can you try TRT7? If there is no problem with TRT7, you should check which of the layers used in DETR have changed in TRT8. By the way, DETR does not use any plugins.

freedenS avatar Jul 22 '22 06:07 freedenS

@freedenS I did try on TRT7 and it worked fine. I uploaded the changes to my fork: https://github.com/vjsrinivas/tensorrtx/pull/1/files. I'll look into the layer changes in TRT8 and maybe check the output shapes at certain points.

vjsrinivas avatar Jul 22 '22 15:07 vjsrinivas

Investigated the createEngine_r50detr function. Printed out all the tensor dimensions for features, pos_embed, input_proj, flatten, and results. They are identical between TRT7 and TRT8 machines.
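
A minimal sketch of the kind of dimension dump described above, so the output from two machines can be diffed line by line. `DimsLike` and `formatDims` are hypothetical names, not part of tensorrtx; `DimsLike` only mirrors the public `nbDims` and `d[]` fields of `nvinfer1::Dims` so the snippet compiles without TensorRT installed.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in mirroring the public fields of nvinfer1::Dims
// (a dimension count plus a fixed-size extent array).
struct DimsLike {
    int32_t nbDims;
    int32_t d[8];
};

// Format a tensor name and its extents as "name: [d0, d1, ...]".
// Templated so it accepts any type exposing nbDims and d[].
template <typename DimsT>
std::string formatDims(const std::string& name, const DimsT& dims) {
    std::ostringstream out;
    out << name << ": [";
    for (int i = 0; i < dims.nbDims; ++i) {
        if (i > 0) out << ", ";
        out << dims.d[i];
    }
    out << "]";
    return out.str();
}
```

With TensorRT available, the same helper should accept the result of `ITensor::getDimensions()`, for example `formatDims("input_proj", tensor->getDimensions())`.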

I also cut up the network and rebuilt it to see if I could isolate which layer is causing the original error. It seems to be somewhere in mha2 in TransformerDecoderLayer... That's all I've gotten so far.
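
The cut-and-rebuild approach can be organized as a binary search over how many layers are kept. This is only a sketch under assumptions: `firstFailingPrefix` is a hypothetical helper, and `buildsOk(k)` is a caller-supplied callback that truncates the network after `k` layers, marks the last tensor as output, and reports whether the engine built. It also assumes that once a prefix fails to build, every longer prefix fails too.

```cpp
#include <cstddef>
#include <functional>

// Returns the smallest prefix length k in [1, layerCount] for which
// buildsOk(k) is false, or layerCount + 1 if every prefix builds.
// Assumption: failures are monotonic (a failing prefix stays failing
// when more layers are appended).
std::size_t firstFailingPrefix(std::size_t layerCount,
                               const std::function<bool(std::size_t)>& buildsOk) {
    std::size_t lo = 1;
    std::size_t hi = layerCount + 1;  // hi == layerCount + 1 means "nothing failed"
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (buildsOk(mid)) {
            lo = mid + 1;  // prefix of length mid is fine, look further in
        } else {
            hi = mid;      // failure is at mid or earlier
        }
    }
    return lo;
}
```

Each probe costs one engine build, so a network with a few hundred layers needs fewer than ten rebuilds to narrow the failure down to a single layer.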

vjsrinivas avatar Jul 22 '22 16:07 vjsrinivas

I found a report of the same problem that may be useful: https://forums.developer.nvidia.com/t/build-engine-error-when-use-pointnet-like-structure-and-tensorrt-8-0-1-6/183569

freedenS avatar Jul 23 '22 14:07 freedenS

It's discouraging that this could just be an issue with a specific TRT version. I'll grab some SD cards and try TRT 8.2 (my current version is 8.0.1).

vjsrinivas avatar Jul 23 '22 22:07 vjsrinivas

Tried TensorRT 8.2.1 and it worked. I still need to test whether the calibrator and the rest still work.
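
Since the build result depends on the exact TensorRT release, a version check in the code may help flag an unsuitable install early. The sketch below is an assumption-laden illustration: `versionAtLeast` and `kAtLeastTrt821` are hypothetical names, and the 8.2.1 threshold only reflects what was observed in this thread (8.0.1 failed, 8.2.1 worked), not a documented minimum. The `NV_TENSORRT_*` macros come from `NvInferVersion.h`; placeholder values are defined when that header is not available.

```cpp
#if defined(__has_include)
#  if __has_include(<NvInferVersion.h>)
#    include <NvInferVersion.h>  // defines NV_TENSORRT_MAJOR/MINOR/PATCH
#  endif
#endif
#ifndef NV_TENSORRT_MAJOR
// Assumption: placeholder values so the sketch compiles without TensorRT.
#  define NV_TENSORRT_MAJOR 8
#  define NV_TENSORRT_MINOR 2
#  define NV_TENSORRT_PATCH 1
#endif

// True when (major, minor, patch) is at least (reqMajor, reqMinor, reqPatch),
// compared lexicographically.
constexpr bool versionAtLeast(int major, int minor, int patch,
                              int reqMajor, int reqMinor, int reqPatch) {
    return major != reqMajor ? major > reqMajor
         : minor != reqMinor ? minor > reqMinor
         : patch >= reqPatch;
}

// Whether the TensorRT headers in use are at least the release that
// worked in this thread.
constexpr bool kAtLeastTrt821 = versionAtLeast(
    NV_TENSORRT_MAJOR, NV_TENSORRT_MINOR, NV_TENSORRT_PATCH, 8, 2, 1);
```

A build could log a warning when the major version is 8 and `kAtLeastTrt821` is false, while leaving the TRT7 path untouched.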

vjsrinivas avatar Jul 29 '22 23:07 vjsrinivas

@freedenS On both TensorRT 7 and 8, I'm seeing many high-confidence detections and none with confidence below ~0.5, even with SCORE_THRES set to 0.001. The results also don't match the detections from the original DETR at the same threshold.

Is there another parameter to change?
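
For reference when comparing against the original DETR, here is a sketch of how its post-processing scores a query, as far as I understand it: softmax over all class logits including a trailing "no object" class, then the maximum over the real classes only. `ScoredLabel` and `scoreQuery` are hypothetical names, and this does not claim to reproduce what the tensorrtx port does.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct ScoredLabel {
    int label;    // index of the best real class, or -1 if none
    float score;  // softmax probability of that class
};

// Scores a single query. Assumption: the last logit is the "no object"
// class and is excluded from the reported label, but still takes part in
// the softmax normalization.
ScoredLabel scoreQuery(const std::vector<float>& logits) {
    ScoredLabel best{-1, 0.0f};
    if (logits.size() < 2) return best;  // need at least one real class

    // Numerically stable softmax: subtract the max logit before exp.
    const float maxLogit = *std::max_element(logits.begin(), logits.end());
    std::vector<float> expVals(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        expVals[i] = std::exp(logits[i] - maxLogit);
        sum += expVals[i];
    }

    // Best class among all entries except the trailing "no object" slot.
    for (std::size_t i = 0; i + 1 < logits.size(); ++i) {
        const float p = expVals[i] / sum;
        if (p > best.score) best = ScoredLabel{static_cast<int>(i), p};
    }
    return best;
}
```

Under this scheme, a query dominated by the "no object" logit gets a small but nonzero score, so a threshold of 0.001 would normally let through many detections well below 0.5.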

vjsrinivas avatar Jul 30 '22 04:07 vjsrinivas

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Sep 29 '22 07:09 stale[bot]