Zero Zeng comments

Results: 580 comments of Zero Zeng

Subnormal FP16 value detected

> There is still a small difference: the absolute differences between FP32 and FP16 (|FP32 - FP16|) are 2.08711E-08, 2.73753E-05, and 4.54187E-05. It's expected; the diff is very small. > "[E] [TRT]...
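To make the FP16 subnormal range and the |FP32 - FP16| comparison concrete, here is a minimal sketch using only the Python standard library. It assumes that rounding a value through the `struct` module's `e` (half precision) and `f` (single precision) formats is an acceptable stand-in for what the hardware does; the helper names and the sample values are hypothetical and do not come from the issue.

```python
import struct

# Smallest positive normal FP16 value (2^-14, about 6.1e-05).
FP16_MIN_NORMAL = 2.0 ** -14


def to_fp16(x: float) -> float:
    """Round x to the nearest FP16 value and return it as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest FP32 value and return it as a Python float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def is_fp16_subnormal(x: float) -> bool:
    """True if x rounds to a nonzero FP16 value below the normal range."""
    magnitude = abs(to_fp16(x))
    return 0.0 < magnitude < FP16_MIN_NORMAL


def abs_diff_fp32_fp16(x: float) -> float:
    """Absolute difference between the FP32 and FP16 roundings of x."""
    return abs(to_fp32(x) - to_fp16(x))


if __name__ == "__main__":
    # Hypothetical sample values, chosen only for illustration.
    for value in (1.0, 0.1, 3.0e-05):
        print(value, is_fp16_subnormal(value), abs_diff_fp32_fp16(value))
```

Values whose magnitude falls below `FP16_MIN_NORMAL` lose precision when stored in FP16, which is one plausible reason a small but nonzero FP32/FP16 gap shows up alongside a subnormal warning.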

Subnormal FP16 value detected

@zll0000 Did you hit the same error? Can you share your ONNX model here so that I can check it? Thanks!

why plugin？

The tool is public now: https://github.com/NVIDIA-AI-IOT/tensorrt_plugin_generator

There are something wrong in WSL2 with tensorrt example

Closing this since it has been inactive for more than 3 weeks. Feel free to reopen it if you have any further questions. Thanks!

Question: Example of using IOutputAllocator with enqueueV3?

We still don't support data-dependent plugins. @samurdhikaru for more info :-)

TensorRT 8.5.2.2 GPU AGX Xavier Jetson 5.1 - Error Code 10: Internal Error (Could not find any implementation for node /model.0/conv/Conv.)

1. Do other models work? 2. Can you try increasing the workspace size? 3. If that still doesn't fix it, please share the ONNX model here. Thanks!

TensorRT 8.5.2.2 GPU AGX Xavier Jetson 5.1 - Error Code 10: Internal Error (Could not find any implementation for node /model.0/conv/Conv.)

I feel like it may be an environment issue. Could you please flash the latest JP 6.0 and try again? We won't fix bugs on TRT 8.5 now.

TensorRT 8.5.2.2 GPU AGX Xavier Jetson 5.1 - Error Code 10: Internal Error (Could not find any implementation for node /model.0/conv/Conv.)

I couldn't reproduce the issue with polygraphy:
```
[I] Finished engine building in 434.928 seconds
nvidia@tegra-ubuntu:~/scratch.zeroz_sw/github_bug/3545$ polygraphy convert last.onnx --int8 -o out.plan
```

use tensorrt inference bert, speed slow than onnxruntime

How many iterations are you using? The first few iterations will take longer due to GPU warm-up and initialization. I would highly recommend that you use our trtexec tool to...
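The warm-up point can be illustrated with a minimal timing sketch. This is not how trtexec measures performance; it only shows the general idea of discarding the first few runs before timing. The `benchmark` helper and its `run_once` parameter are hypothetical, and the sketch assumes `run_once` blocks until the work is finished (with a real GPU backend that would mean synchronizing before returning).

```python
import time
from typing import Callable


def benchmark(run_once: Callable[[], object], warmup: int = 10, iters: int = 100) -> float:
    """Return the mean latency of run_once in milliseconds, excluding warm-up runs."""
    if iters <= 0:
        raise ValueError("iters must be positive")

    # Warm-up runs absorb one-time costs (initialization, caches) and are not timed.
    for _ in range(warmup):
        run_once()

    # Timed runs: assumes run_once returns only after its work has completed.
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    elapsed = time.perf_counter() - start

    return elapsed * 1000.0 / iters


if __name__ == "__main__":
    # Stand-in workload; replace with a real inference call.
    mean_ms = benchmark(lambda: sum(range(10_000)), warmup=5, iters=50)
    print(f"mean latency: {mean_ms:.3f} ms")
```

Comparing two runtimes with the same `warmup` and `iters` values keeps the comparison fair, since both then exclude the same number of slow initial iterations.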

converting to TensorRT barely increases performance

Normally it's caused by how you measure the perf. Could you please try getting a perf summary using trtexec? Usage would be like `trtexec --onnx=model.onnx`; then check `GPU compute time`.