Implementation bug in Disentangled Attention Plugin
Description
Where the tmp values are precomputed for re-use, tmp is calculated as below:
https://github.com/NVIDIA/TensorRT/blob/78245b0ac2af9a208ed02e5257bfd3ee7ae8a88d/plugin/disentangledAttentionPlugin/disentangledKernel.cu#L122
The sequence length `dimResult.y` is wrongly used as the maximum relative position.
According to the Hugging Face implementation and the Microsoft DeBERTa implementation, it should be:
float tmp = (mid - 1) / (logf(dimData1.z - 1) - tmp1) * (1 - kEPSILON);
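To see why the maximum relative position (and not the sequence length) belongs in the denominator, here is a minimal Python sketch of DeBERTa-style log-bucket relative-position mapping, which is what the precomputed `tmp` scale factor implements. Function and variable names are illustrative, not the plugin's actual identifiers:

```python
import math

def make_log_bucket_position(rel_pos, bucket_size, max_position):
    """Map a signed relative position into a log-spaced bucket.

    Positions with |rel_pos| < bucket_size // 2 are kept as-is; farther
    positions are compressed logarithmically. The scale factor divides by
    log(max_position - 1) - log(mid), i.e. it is derived from the
    *maximum relative position*, not the sequence length.
    """
    mid = bucket_size // 2
    if abs(rel_pos) < mid:
        # Near positions: identity mapping.
        return rel_pos
    sign = 1 if rel_pos > 0 else -1
    # Far positions: log-compress into the remaining (mid - 1) buckets.
    log_pos = math.ceil(
        math.log(abs(rel_pos) / mid)
        / math.log((max_position - 1) / mid)
        * (mid - 1)
    ) + mid
    return sign * log_pos
```

For example, with `bucket_size=256` and `max_position=512`, positions within ±127 map to themselves, while the farthest position 511 lands in the last bucket, 255. Substituting the sequence length for `max_position` changes the compression ratio whenever the two differ, which is the inconsistency with the reference implementations reported above.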
Environment
TensorRT Version: 8.6
NVIDIA GPU: N/A
NVIDIA Driver Version: N/A
CUDA Version: N/A
CUDNN Version: N/A
Operating System: N/A
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
Model link: N/A
Steps To Reproduce
Commands or scripts: N/A
Have you tried the latest release?: N/A
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (`polygraphy run <model.onnx> --onnxrt`): N/A
@samurdhikaru @ttyio ^ ^
@zerollzeng could you help to create internal issue for this? thank you!
Filed internal bug 4424499 to track this.
Thanks @fillmore. I previously fixed this in FT (https://github.com/NVIDIA/FasterTransformer/blob/df4a7534860137e060e18d2ebf019906120ea204/src/fastertransformer/kernels/disentangled_attention_kernels.cu#L268) but forgot to sync it to the TRT plugin.
@symphonylyh , any ETA about the fix in TensorRT?
@fillmore Please expect the fix in the next TRT release. Thanks.
This is fixed in main, closing and thanks all!