FasterTransformer [Bugfix] GptJ & GptNeoX batch inference error

Open YZP17121579 opened this issue 2 years ago • 1 comments

GptJ and GptNeoX may generate random outputs when running batch inference without a prefix prompt. The problem is caused by the nullptr check at https://github.com/NVIDIA/FasterTransformer/blob/f8e42aac45815c5be92c0915b12b9a6652386e8c/src/fastertransformer/kernels/gpt_kernels.cu#L1064

Aug 11 '23 09:08 YZP17121579

I think this duplicates the fix in #716, which is more elegant and efficient.

Aug 12 '23 01:08 BasicCoder
