Result divergence of deployed ConvNeXt model
I used the latest MMDeploy tool to convert ConvNeXt to the TensorRT backend. The conversion succeeded, but when I visualized the results, the output from the TensorRT backend was completely wrong.
I checked the export process and found that the output of the intermediate ONNX model was correct, so I suspect the issue comes from the TensorRT model.
Below is the command and configs I have used.
python tools/deploy.py \
configs/mmseg/segmentation_tensorrt-fp16_static-512x512.py \
./mmsegmentation/configs/convnext/upernet_convnext_tiny_fp16_512x512_160k_ade20k.py \
./mmsegmentation/ckpts/upernet_convnext_tiny_fp16_512x512_160k_ade20k_20220227_124553-cad485de.pth \
./test.jpg \
--work-dir work-dirs \
--device cuda:0
segmentation_tensorrt-fp16_static-512x512.py
upernet_convnext_tiny_fp16_512x512_160k_ade20k.py
upernet_convnext_tiny_fp16_512x512_160k_ade20k_20220227_124553-cad485de.pth
@haofanwang Are you testing the ONNX model with FP32 in ONNX Runtime? Maybe you could try TensorRT FP32 as well. BTW, we'll look into the problems of exporting a PyTorch FP16 model to TensorRT FP16/INT8 later.
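A minimal sketch of such an FP32 conversion, assuming MMDeploy ships a matching non-FP16 static config (the name segmentation_tensorrt_static-512x512.py is an assumption; check configs/mmseg/ for the exact file in your version):

# Sketch: same conversion as above, but with an (assumed) FP32 TensorRT config,
# so the comparison against ONNX Runtime isolates the effect of the FP16 cast.
python tools/deploy.py \
configs/mmseg/segmentation_tensorrt_static-512x512.py \
./mmsegmentation/configs/convnext/upernet_convnext_tiny_fp16_512x512_160k_ade20k.py \
./mmsegmentation/ckpts/upernet_convnext_tiny_fp16_512x512_160k_ade20k_20220227_124553-cad485de.pth \
./test.jpg \
--work-dir work-dirs-fp32 \
--device cuda:0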
FP32 works. Please let me know if there is any update on FP16, thanks.
Hi, yes, there is a noticeable difference between the FP16 PyTorch model and the converted FP16 TensorRT model. We are not sure which layer has problems such as numerical overflow. Maybe you could ask for suggestions in the NVIDIA repo as well.
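One way to narrow this down is to let Polygraphy compare ONNX Runtime (FP32) against a TensorRT FP16 engine built from the same exported model. This is only a sketch: the path work-dirs/end2end.onnx assumes the intermediate ONNX that deploy.py writes into --work-dir, and the tolerances are illustrative.

# Sketch: compare ONNX Runtime vs. TensorRT FP16 on the exported ONNX model.
pip install polygraphy
polygraphy run work-dirs/end2end.onnx --onnxrt --trt --fp16 \
--atol 1e-2 --rtol 1e-2
# Adding "--onnx-outputs mark all --trt-outputs mark all" compares every
# intermediate tensor, which can help locate the first layer that overflows.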
This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.
This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.
@haofanwang have you solved your problem?
Same here. ConvNeXt in a detection model is sensitive to TensorRT FP16. For ConvNeXt V2 the problem is even worse because of GRN; with it, the TensorRT FP16 model can't detect anything!
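If the divergence does come from LayerNorm/GRN overflowing in FP16, one workaround sketch is to build an FP16 engine while pinning those layers to FP32. This assumes a recent trtexec (TensorRT 8.5+) and that the affected layer names match "*Norm*", both of which are assumptions; inspect the real layer names (e.g. with polygraphy inspect model) first.

# Sketch: FP16 engine with normalization layers forced back to FP32.
trtexec --onnx=work-dirs/end2end.onnx --fp16 \
--precisionConstraints=obey \
--layerPrecisions=*Norm*:fp32 \
--saveEngine=work-dirs/end2end_mixed.engine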