Zhexin Li

4 comments by Zhexin Li

Hi, I'm new to TensorRT, so I can't answer your question. Would you mind telling me how you got the TensorRT engine visualization image? It looks very useful.

Got it! Thanks for your trouble. It helped a lot.

Thanks. Is there any way to get past the MHA's seq_len limitation? Or does NV have a plan to extend mha_v2 to larger dimensions so that diffusion-based models can benefit...

> The INT8 MHA fused kernels are already integrated in TRT 8.6. The only caveat is that SeqLen must be 512 or below.
>
> It does use flash attention...