
Replication of BEVFusion PTQ Error

Open ihaohe opened this issue 1 year ago • 4 comments

Thanks for your nice work! I've reproduced the BEVFusion PTQ performance with the model (ResNet50) you provided and the script tools/test-mAP-for-cuda.py, following https://github.com/NVIDIA-AI-IOT/Lidar_AI_Solution/blob/master/CUDA-BEVFusion/qat/README.md to regenerate the ResNet50-PTQ model.

But when I train my own BEVFusion ResNet50 model (which reaches 68.10 mAP and 71.13 NDS, like yours) and apply PTQ to it, the PTQ process completes successfully. However, the nuScenes evaluation then terminates, as shown in this screenshot: c75550bf1f5942485a20139dfc86dd3a

I printed the box info and found that some boxes contain "nan" values! (image)
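A minimal sketch of how the affected boxes could be located before evaluation, so that the failing samples can be traced back to specific inputs. It assumes the results are a mapping from `sample_token` to a list of box dicts with numeric fields such as `translation`, `size`, and `velocity`, as in the nuScenes detection result format. The helper name `find_non_finite_boxes` is hypothetical and not part of the repository.

```python
import math
import numbers


def find_non_finite_boxes(results):
    """Return (sample_token, box_index, field) for each field holding NaN/Inf.

    `results` is assumed to map sample_token -> list of box dicts whose values
    are numbers, lists of numbers, or strings (strings are ignored).
    """
    bad = []
    for token, boxes in results.items():
        for index, box in enumerate(boxes):
            for field, value in box.items():
                values = value if isinstance(value, (list, tuple)) else [value]
                for v in values:
                    # Only numeric entries can be NaN/Inf; skip names and tokens.
                    if isinstance(v, numbers.Real) and not math.isfinite(v):
                        bad.append((token, index, field))
                        break  # one report per field is enough
    return bad


# Illustrative data only; the tokens and values below are made up.
example_results = {
    "token_a": [
        {"translation": [1.0, 2.0, 0.5], "detection_name": "car", "detection_score": 0.9},
    ],
    "token_b": [
        {"translation": [float("nan"), 2.0, 0.5], "detection_name": "car", "detection_score": 0.4},
    ],
}
bad_boxes = find_non_finite_boxes(example_results)
```

Knowing which samples and fields are affected helps to tell whether the NaNs come from a single head output (for example only `velocity`) or from every regression target.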

I suspect something is wrong in the PTQ process, but I have no idea how to debug it. Can you give me some suggestions?
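One possible starting point, sketched under assumptions: scan the quantized model's tensors for NaN/Inf entries and for calibration ranges that are exactly zero, since a zero range could turn into a division by zero when the scale is computed. The function takes plain Python lists so it runs without any framework. The `_amax` suffix is an assumption about how pytorch-quantization names its calibration buffers, and `scan_tensors` is a hypothetical helper.

```python
import math


def scan_tensors(named_values, range_suffix="_amax"):
    """Map tensor name -> list of issues found ("non-finite", "zero range").

    `named_values` maps a tensor name to a flat list of numbers. With PyTorch
    it could be built as:
        {k: v.flatten().tolist() for k, v in model.state_dict().items()}
    `range_suffix` marks calibration-range entries (assumed naming).
    """
    problems = {}
    for name, values in named_values.items():
        issues = []
        if any(not math.isfinite(v) for v in values):
            issues.append("non-finite")
        if name.endswith(range_suffix) and any(v == 0.0 for v in values):
            issues.append("zero range")
        if issues:
            problems[name] = issues
    return problems


# Illustrative data only; the layer names below are made up.
example_tensors = {
    "backbone.conv1.weight": [0.1, -0.2],
    "backbone.conv1._input_quantizer._amax": [0.0],
    "head.bias": [float("inf")],
}
report = scan_tensors(example_tensors)
```

Running the same scan on the FP32 checkpoint and on the calibrated model would show whether the bad values are already present in the trained weights or only appear after calibration.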

ihaohe, Aug 07 '23 10:08

@ihaohe Could you please provide the version information for your CUDA and TensorRT?

liuanqi-libra7, Aug 08 '23 01:08

@liuanqi-libra7

CUDA 11.3
TensorRT-8.5.1.7
pytorch-quantization 2.1.2
torch 1.10.1+cu113
mmcv 1.4.0
mmdet 2.20.0
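Version details like these could be gathered with a short script. This is a sketch using only the standard library. The distribution names passed in are assumptions (mmcv, for instance, may be installed as `mmcv-full`), and the CUDA toolkit version is not a Python package, so it is not covered here.

```python
from importlib import metadata


def collect_versions(packages):
    """Return {distribution name: installed version or "not installed"}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to the local environment.
    wanted = ["tensorrt", "pytorch-quantization", "torch", "mmcv", "mmcv-full", "mmdet"]
    for name, version in collect_versions(wanted).items():
        print(f"{name}: {version}")
```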

My BEVFusion model is here: model. You can use it to reproduce the error I encountered. Thanks for your help.

ihaohe, Aug 08 '23 02:08

Here is the INT8 result for my BEVFusion model: onnx_int8. I hope it helps you locate the problem.

ihaohe, Aug 08 '23 02:08

@hopef @liuanqi-libra7 I'm sorry to bother you. Has there been any progress on this issue?

ihaohe, Aug 18 '23 07:08