TensorRT
Fused_MHA does not support seq_len = 1024, Dh = 72, causal_mask = false
When I tried FMHA_v2, I found that it does not support my use case. Is there any way to use fmha_v2 without changing the model? Thanks a lot.