Bernard Ryhede Bengtsson
#113 I'm facing a similar issue. You can follow the TensorRT issue linked in my post.
Thanks for the quick response. > Was there an older version of TRT with which you'd get reasonable FP16 results? But, not with the current version you're using? To clarify...
Here is the Polygraphy log. Other processes were running on the GPU in parallel, which explains the long inference latencies. [polygraphy.log](https://github.com/NVIDIA/TensorRT/files/15443452/polygraphy.log) Snippet that includes the pass rate and a comparison of...
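For context, a Polygraphy accuracy comparison like the one in the attached log is typically produced with `polygraphy run`, comparing TensorRT against ONNX Runtime. The model name and tolerances below are placeholders, not the exact invocation used in this thread:

```shell
# Compare TensorRT FP16 outputs against ONNX Runtime (FP32 reference).
# model.onnx, --atol, and --rtol values are illustrative placeholders.
polygraphy run model.onnx \
    --trt --fp16 \
    --onnxrt \
    --atol 1e-3 --rtol 1e-3
```

Tightening `--atol`/`--rtol` makes the pass/fail criterion stricter; the pass rate reported in the log depends directly on these tolerances.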
> However, in the log snippet above I'm not seeing --fp16 or --stronglyTyped which means this model runs fp32 only. What log file are you referring to? If it is...
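To make the precision mode explicit: with `trtexec`, FP16 kernels are only considered when `--fp16` (or `--stronglyTyped`, for strongly typed networks) is passed; otherwise the engine is built FP32-only. A minimal sketch, with the model path as a placeholder:

```shell
# Without --fp16, trtexec builds an FP32-only engine.
# model.onnx and model_fp16.engine are illustrative names.
trtexec --onnx=model.onnx \
        --fp16 \
        --saveEngine=model_fp16.engine

# Alternatively, for models exported with explicit per-tensor types:
trtexec --onnx=model.onnx --stronglyTyped
```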
(.1 & .2) We didn't fix the GPU clock. We only have one other process running on the GPU: ``` +-----------------------------------------------------------------------------------------+ | Processes: | | GPU GI CI PID Type...
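Since unfixed clocks can skew latency measurements, the GPU clock can be locked with `nvidia-smi` before benchmarking. The clock value below is a placeholder; the supported range is device-specific:

```shell
# List the clock rates this GPU supports.
nvidia-smi -q -d SUPPORTED_CLOCKS

# Lock the graphics clock to a fixed rate (value is a placeholder,
# pick one from the supported list). Requires root privileges.
sudo nvidia-smi --lock-gpu-clocks=1410,1410

# Restore default clock behavior after benchmarking.
sudo nvidia-smi --reset-gpu-clocks
```

With clocks locked and no other processes on the GPU, latency numbers from trtexec or Polygraphy become much more reproducible.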
@johnyang-nv I saw your issue [About TensorRT Latency Measure](https://github.com/mit-han-lab/efficientvit/issues/52#issue-2005448153), and thought maybe you had some insight on this?