src/fastertransformer/kernels/decoder_masked_multihead_attention/decoder_masked_multihead_attention_template.hpp:36: enabling this macro definition causes a build error

Open pengl opened this issue 2 years ago • 0 comments

Branch/Tag/Commit

main

Docker Image Version

nvcr.io/nvidia/pytorch:22.08-py3

GPU name

A10

CUDA Driver

515.65.01

Reproduced Steps

https://github.com/NVIDIA/FasterTransformer/blob/f0b5b8631806aedfbe0d844eb9a32202002dd463/src/fastertransformer/kernels/decoder_masked_multihead_attention/decoder_masked_multihead_attention_template.hpp#L38

Enabling the macro `MMHA_USE_FP32_ACUM_FOR_LOGITS` at the line linked above produces compile errors.
How should this macro be enabled? What else needs to be changed for the build to succeed?
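For readers unfamiliar with this kind of switch, here is a minimal host-side sketch of how a compile-time macro can select the accumulator type used for the Q·K logits. Only the macro name comes from the issue. `Fixed16`, `LogitAccum`, `mul_add`, `qk_dot`, and the other helpers are hypothetical names made up for illustration and are not FasterTransformer code. The sketch does not diagnose the actual error; it only shows that every helper touching the accumulator needs a variant for the type the macro selects, so a missing variant is one possible (unconfirmed) source of compile errors.

```cpp
#include <cstddef>
#include <type_traits>

// Enable the switch in source. Guarded so it can also come from the command line.
#ifndef MMHA_USE_FP32_ACUM_FOR_LOGITS
#define MMHA_USE_FP32_ACUM_FOR_LOGITS
#endif

// Hypothetical reduced-precision storage type (a stand-in for a 16-bit float):
// fixed point with 8 fractional bits.
struct Fixed16 {
    short raw;
};

inline Fixed16 to_fixed(float x) { return Fixed16{static_cast<short>(x * 256.0f)}; }
inline float to_float(Fixed16 h) { return static_cast<float>(h.raw) / 256.0f; }

#ifdef MMHA_USE_FP32_ACUM_FOR_LOGITS
// With the macro defined, partial sums are kept in 32-bit float.
using LogitAccum = float;
inline LogitAccum zero_accum() { return 0.0f; }
inline LogitAccum mul_add(Fixed16 a, Fixed16 b, LogitAccum acc) {
    return acc + to_float(a) * to_float(b);
}
inline float accum_to_float(LogitAccum acc) { return acc; }
#else
// Without it, partial sums are rounded back to the storage type at every step.
using LogitAccum = Fixed16;
inline LogitAccum zero_accum() { return Fixed16{0}; }
inline LogitAccum mul_add(Fixed16 a, Fixed16 b, LogitAccum acc) {
    return to_fixed(to_float(acc) + to_float(a) * to_float(b));
}
inline float accum_to_float(LogitAccum acc) { return to_float(acc); }
#endif

// Dot product of a query and a key vector, written only against the
// LogitAccum helpers so it compiles under either setting of the macro.
inline float qk_dot(const Fixed16* q, const Fixed16* k, std::size_t n) {
    LogitAccum acc = zero_accum();
    for (std::size_t i = 0; i < n; ++i) {
        acc = mul_add(q[i], k[i], acc);
    }
    return accum_to_float(acc);
}
```

A macro like this can usually be enabled either by adding (or uncommenting) a `#define` in the header, or by passing `-DMMHA_USE_FP32_ACUM_FOR_LOGITS` to the compiler. Whether either is sufficient for FasterTransformer is exactly what this issue is asking.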

Oct 11 '23 08:10 pengl