TransformerEngine
Question about the construction of cu_seqlens_q in ring attention unit test
I've read the issue (https://github.com/NVIDIA/TransformerEngine/issues/1409) about the usage of cu_seqlens_q, and I think I understand how cu_seqlens_q is used. However, I'm confused about why the ring attention unit test sets cu_seqlens_q[-1] = cu_seqlens_q[-2] when constructing cu_seqlens_q:
https://github.com/NVIDIA/TransformerEngine/blob/a169e9e709d51b34806babd7fa1afaa7ccbfeeb7/tests/pytorch/attention/run_attention_with_cp.py#L190
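To make the question concrete, here is a minimal sketch in plain Python of what that assignment implies arithmetically. It is not the test's actual code: the sequence lengths, the helper name `cumulative_seqlens`, and the assumption that the array carries one extra trailing slot are all made up for illustration.

```python
from itertools import accumulate


def cumulative_seqlens(seqlens):
    """Return [0, l0, l0 + l1, ...] for a list of per-sequence lengths."""
    return [0] + list(accumulate(seqlens))


# Hypothetical per-sequence lengths, for illustration only.
seqlens = [3, 5, 2]

# Assumption: the array has one extra trailing slot beyond the real sequences.
cu_seqlens_q = cumulative_seqlens(seqlens) + [0]

# The assignment in question: the last offset repeats the one before it.
cu_seqlens_q[-1] = cu_seqlens_q[-2]

# Recover per-segment lengths as differences of consecutive offsets.
lengths = [end - start for start, end in zip(cu_seqlens_q, cu_seqlens_q[1:])]

print(cu_seqlens_q)  # [0, 3, 8, 10, 10]
print(lengths)       # [3, 5, 2, 0]
```

So, read as cumulative offsets, the assignment makes the final segment zero-length. What I don't understand is why the test needs such a segment.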
I'd like to know whether this is related to specific operations in TE's Ring Attention or in Megatron-LM. Can anyone give me some suggestions?