INT8 engine building is too slow
Hi everyone,
I ran into a problem when launching YOLOv4 inference with INT8 precision on an RTX 3090 GPU:
the buildEngineWithConfig() method is very slow (it had been running for 1.5 hours when I interrupted the process).
I tried increasing MaxWorkspaceSize from 1 MiB (1<<20) to 32, 64, and 512 MiB, but it did not help.
~1k images are used for INT8 calibration.
Engines with FP32 and FP16 precision build in about 1 minute.
Environment:
Ubuntu 20.04
TensorRT 7.2.1
CUDA 11.1
cuDNN 8
Inference with the same configuration works well on a laptop with an RTX 2070 (building the INT8 engine takes ~12 minutes).
Try decreasing the number of calibration images to ~100.
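In case it helps, here is a minimal sketch of one way to cut the calibration set down, assuming your calibrator reads its batches from a list of image file paths. The helper name sampleCalibrationImages and the fixed seed are made up for illustration and are not part of TensorRT; it only picks a random subset of paths, which you would then pass to your own calibrator.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper (not a TensorRT API): returns at most maxCount paths
// drawn at random from allPaths, so the calibrator has fewer images to process.
std::vector<std::string> sampleCalibrationImages(
    std::vector<std::string> allPaths,  // taken by value so a copy is shuffled
    std::size_t maxCount,
    unsigned seed = 42)                 // assumed fixed seed for a reproducible subset
{
    std::mt19937 rng(seed);
    std::shuffle(allPaths.begin(), allPaths.end(), rng);
    if (allPaths.size() > maxCount) {
        allPaths.resize(maxCount);      // keep only the first maxCount shuffled paths
    }
    return allPaths;
}
```

For example, sampleCalibrationImages(paths, 100) on a list of ~1k paths would give the ~100 images suggested above. A random subset is used rather than the first 100 files so that the sample is less likely to be skewed by file ordering.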
Thank you! It helped.