Zero Zeng

Results 581 comments of Zero Zeng

The results of FP32 TRT and FP32 ONNXRT are very close.

I've filed internal bug 3813586 to track this.

This is not a bug. The model contains a LayerNorm subgraph, and when it runs in FP16 the results differ between ORT and TRT. This happens because this subgraph's...
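As a rough illustration of why a LayerNorm evaluated in FP16 can drift from an FP32 reference, here is a small NumPy sketch. It is not what TRT or ORT actually execute; the `layer_norm` helper, the input shape, and the epsilon value are assumptions made up for this example.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Hypothetical helper: plain LayerNorm over the last axis,
    # computed in whatever dtype `x` carries (no learned scale or bias).
    mean = x.mean(axis=-1, keepdims=True)
    var = ((x - mean) ** 2).mean(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
# Assumed input: 4 rows of 256 features, standard normal.
x32 = rng.normal(loc=0.0, scale=1.0, size=(4, 256)).astype(np.float32)

ref = layer_norm(x32)                        # FP32 reference
half_raw = layer_norm(x32.astype(np.float16))  # same math, FP16 storage
half = half_raw.astype(np.float32)

# Largest element-wise deviation between the two precisions.
max_abs_diff = float(np.abs(ref - half).max())
print(f"max abs diff between FP32 and FP16 LayerNorm: {max_abs_diff:.6f}")
```

The deviation is small but not zero, which is why an FP16 comparison between two runtimes generally needs looser tolerances than an FP32 one.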

Tried to reproduce the issue with TRT 8.4.1.5 using polygraphy: ``` [I] onnxrt-runner-N0-09/07/22-08:16:25 | Completed 1 iteration(s) in 0.1693 ms | Average inference time: 0.1693 ms. [I] Accuracy Comparison |...

I can reproduce this with ``` [I] trt-runner-N0-09/09/22-00:16:40: output | Stats: mean=0.35972, std-dev=0.34652, var=0.12008, median=0.27958, min=0 at (1, 0, 0), max=0.96826 at (0, 0, 1), avg-magnitude=0.35972 [I] ---- Histogram ----...

The issue has been fixed in TRT 8.5; there will be a preview feature that fixes it. Please wait for the 8.5 release, which is coming soon :-)

https://huggingface.co/docs/transformers/model_doc/bert or our [demoBert](https://github.com/NVIDIA/TensorRT/tree/main/demo/BERT)?

I can reproduce this in TRT 8.5.0.9, but the issue is gone when I don't use dynamic shapes. ``` [I] trt-runner-N0-09/22/22-00:17:14 | Completed 1 iteration(s) in 17.49 ms | Average...