Haitian Jiang


Same issue here with TE 2.1.0, torch 2.5.1+cu124, CUDA 12.4, cuDNN 9.8.0. TE 1.13.0 works fine in my environment.

Setting the environment variable `NVTE_BATCH_MHA_P2P_COMM` to 1 makes this error go away. See the Transformer Engine code here: https://github.com/NVIDIA/TransformerEngine/blob/303c6d16203b3cb01675f7adb7c21956f140e0ee/transformer_engine/pytorch/attention.py#L1869
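A minimal sketch of the workaround: the variable must be exported before the process that imports Transformer Engine starts, so that it is visible when the attention module reads it. (Only the variable name comes from the comment above; the launch command is a placeholder for your own script.)

```shell
# Force the P2P batched-MHA communication path (variable name taken from
# the linked Transformer Engine source).
export NVTE_BATCH_MHA_P2P_COMM=1

# Confirm the value is visible to child processes before launching training:
python -c 'import os; print(os.environ["NVTE_BATCH_MHA_P2P_COMM"])'
```

Alternatively, prefix a single run without polluting the shell session: `NVTE_BATCH_MHA_P2P_COMM=1 python your_script.py` (script name is a placeholder).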