Zero Zeng

Results: 581 comments by Zero Zeng

That's because when dynamic shapes are enabled and you specify a new binding shape for a context, at the first inference TRT has to do a shape inference to...

Just use 1x512 as the opt shape when building the engine.
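For illustration, a build command with 1x512 as the opt shape might look like the following trtexec invocation. This is a sketch only: the input name `input_ids` and the min/max profile bounds are assumptions, not from the original comment.

```shell
# Hypothetical engine build where 1x512 is the opt shape.
# The input name "input_ids" and the 1x1 / 16x512 bounds are illustrative.
trtexec --onnx=model.onnx \
        --saveEngine=model.plan \
        --minShapes=input_ids:1x1 \
        --optShapes=input_ids:1x512 \
        --maxShapes=input_ids:16x512
```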

When dynamic shapes are enabled, TRT selects kernel tactics that have the best performance and are suitable for all input shapes between the min shape and the max shape....
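The selection rule described above can be sketched with a toy model (this is an illustration of the idea, not TRT's actual tactic chooser): a tactic is eligible only if it can run every shape in [min, max], and among eligible tactics the one fastest at the opt shape wins.

```python
# Toy model of tactic selection under dynamic shapes (illustrative only).
# Each "tactic" declares the shape range it supports and a cost model.
tactics = [
    {"name": "tactic_a", "min": 1, "max": 64,  "time_at": lambda n: 0.5 * n},
    {"name": "tactic_b", "min": 1, "max": 512, "time_at": lambda n: 0.8 * n},
]

def select_tactic(tactics, min_shape, opt_shape, max_shape):
    # Keep only tactics that cover the whole [min, max] range...
    valid = [t for t in tactics
             if t["min"] <= min_shape and t["max"] >= max_shape]
    # ...then pick the one with the best performance at the opt shape.
    return min(valid, key=lambda t: t["time_at"](opt_shape))

print(select_tactic(tactics, 1, 16, 128)["name"])  # tactic_b: a can't cover 128
```

Note that tactic_a is faster at the opt shape but is rejected because it cannot handle the max shape; this is why a very wide [min, max] range can cost performance at any single shape.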

your "shorter text shape" != opt shape right? trt only make sure the kernel is able to run with the "shorter text shape" but don't guarantee its performance. only optimize...

Please refer to https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#work_dynamic_shapes > Will the input shape be truncated because the context shape is smaller? Yes, it always uses the binding shape as the input, e.g....
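Since TRT always reads exactly the binding shape from the input buffer, the application has to make its data fit that shape. A minimal sketch of that host-side step, assuming a hypothetical helper name and pad id:

```python
import numpy as np

def fit_to_binding_shape(token_ids, binding_len, pad_id=0):
    # TRT reads exactly `binding_len` elements from the bound buffer:
    # shorter sequences must be padded, longer ones get truncated.
    ids = list(token_ids)[:binding_len]          # truncate if too long
    ids += [pad_id] * (binding_len - len(ids))   # pad if too short
    return np.array(ids, dtype=np.int64)

print(fit_to_binding_shape([101, 2023, 102], 5))  # pads out to length 5
```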

@Vinayaks117 I don't have much time now, can you try the latest TRT on your side?

Should be the same issue as https://github.com/NVIDIA/TensorRT/issues/2338; it can be fixed with the preview feature in TRT 8.5.1:

```
&&&& PASSED TensorRT.trtexec [TensorRT v8501] # trtexec --onnx=model.onnx --preview=+fasterDynamicShapes0805 --saveEngine=model_bs16.plan --minShapes=input_ids:1x128,attention_mask:1x128,token_type_ids:1x128 --optShapes=input_ids:16x128,attention_mask:16x128,token_type_ids:16x128 --maxShapes=input_ids:128x128,attention_mask:128x128,token_type_ids:128x128...
```

I can reproduce this with TRT 8.5.0.9, but I cannot confirm this is an accuracy bug since I see some Pow layers that may amplify the diff. @pranavm-nvidia @ttyio...
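To illustrate why a Pow layer makes small diffs hard to classify as bugs (a toy example, not the issue's actual model): a tiny upstream precision difference can grow considerably after an exponentiation.

```python
import numpy as np

# Toy illustration: a pow(x, 8) layer amplifying a small upstream difference.
a = np.float32(1.001)    # reference activation
b = np.float32(1.0015)   # same activation with a small precision error
diff_in = abs(float(a) - float(b))              # difference before the layer
diff_out = abs(float(a) ** 8 - float(b) ** 8)   # difference after pow(x, 8)
print(diff_in, diff_out)  # the output diff is several times the input diff
```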